This research project is concerned with two distinct aspects of the analysis and processing of signals received at multiple sensors from multiple sources when the operating environment is highly uncertain and unstructured. The aim of part I is to investigate a general approach based upon an independent component decomposition (ICD), involving as few assumptions as possible compared to the existing literature. The approach is to be developed in conjunction with specific, useful applications such as space and time diversity multiaccess/multiuser digital communications and multitarget tracking using multi-platform multisensor arrays. In part II the focus is on maneuvering target tracking using kinematic models. This report describes the progress made on the above two aspects of the project. Details are provided in attached copies of 10 papers: 6 journal articles (accepted/submitted) and 4 conference papers.

Multichannel/Multisensor Signal Processing In Uncertain Environments With Application To Multitarget Tracking


WORK  COMPLETED  AND  IN  PROGRESS  (“near  future”)  Progress  has  been  made  on  the 
following  major  aspects  of  the  project: 

1  TRACKING  MANEUVERING  TARGETS  USING  MULTIPLE  KINEMATIC 
MODELS:  We  have  investigated  a  new  method  (interacting  multiple  model  (IMM)  fixed-lag 
smoothing)  for  tracking  a  single  maneuvering  target  in  a  “clean”  environment  (no  clutter). 
This  work  has  been  reported  in  [2].  The  approach  yields  much  improved  performance  when 
compared  to  filtering  at  the  cost  of  a  slight  delay  (one  or  two  sampling  intervals).  We  are 
currently  working  on  the  same  approach  (and  its  variations)  for  tracking  a  single  maneuvering 
target  in  clutter.  We  expect  to  complete  a  journal  paper  on  it  by  the  end  of  June  1998.  Then 
we  plan  to  move  on  to  tracking  multiple  maneuvering  targets  in  clutter. 

2  INDEPENDENT  COMPONENT  DECOMPOSITION  AND  ITS  APPLICATIONS: 

Here we have investigated several approaches for independent source separation, equalization, channel estimation and independent component decomposition. The results have been reported in refs. [1], [3]-[10]. The next step is to focus exclusively on multiple sources and performance analysis.


Journal Articles Submitted/Accepted

[1]  J.K.  Tugnait,  “On  blind  separation  of  convolutive  mixtures  of  independent  linear  signals  in 
unknown  additive  noise,”  IEEE  Trans.  Signal  Processing,  accepted  4/98;  not  yet  scheduled. 
On Blind Separation of Convolutive Mixtures of Independent Linear Signals in Unknown Additive Noise

Jitendra  K.  Tugnait 

Abstract 

Blind  separation  of  independent  signals  (sources)  from  their  linear  convolutive  mixtures  is  considered. 
The  various  signals  are  assumed  to  be  linear  non-Gaussian  but  not  necessarily  i.i.d.  First  an  iterative, 
normalized  higher-order  cumulant  maximization  based  approach  is  exploited  using  the  third-  and/or 
fourth-order  normalized  cumulants  of  the  “beamformed”  data.  It  provides  a  decomposition  of  the  given 
data  at  each  sensor  into  its  independent  signal  components.  In  a  second  approach  higher-order  cumulant 
matching  is  used  to  consistently  estimate  the  MIMO  impulse  response  via  nonlinear  optimization.  In  a 
third  approach  higher-order  cumulants  are  augmented  with  correlations.  For  blind  signal  separation  the 
estimated  channel  is  used  to  decompose  the  received  signal  at  each  sensor  into  its  independent  signal 
components  via  a  Wiener  filter.  Two  illustrative  simulation  examples  are  presented. 

1  Introduction 

Consider a discrete-time FIR MIMO system with N outputs and M inputs given by

$$\mathbf{y}(k) = \sum_{l=0}^{L} \mathbf{F}_l \mathbf{w}(k-l) + \mathbf{n}(k) = [\mathcal{F}(z)]\mathbf{w}(k) + \mathbf{n}(k) = \mathbf{s}(k) + \mathbf{n}(k) \tag{1-1}$$

where $\mathcal{F}(z) = \sum_{l=0}^{L} \mathbf{F}_l z^{-l}$, $\mathbf{y}(k) = [y_1(k) : y_2(k) : \cdots : y_N(k)]^T$, similarly for $\mathbf{w}(k)$, $\mathbf{s}(k)$ and $\mathbf{n}(k)$, $w_j(k)$ is the j-th input at sampling time k, $y_i(k)$ is the i-th output, $n_i(k)$ is the additive measurement noise, and $\{\mathbf{F}_l\}$ is the system matrix impulse response (IR). We allow all of the above variables to be complex-valued. We impose the following conditions:

(AS1) $N \ge M$, i.e. there are at least as many outputs as inputs.

(AS2) $\mathrm{Rank}\{\mathcal{F}(z)\} = M$ for any $|z| = 1$.

(AS3) The vector sequence $\{\mathbf{w}(k)\}$ is assumed to be zero-mean and i.i.d. both temporally and spatially. Also assume that the fourth cumulant or the third cumulant of $\mathbf{w}(k)$ is nonzero.

(AS4) The noise $\{\mathbf{n}(k)\}$ is a zero-mean, stationary Gaussian sequence (with unknown correlation function) independent of $\{\mathbf{w}(k)\}$. Moreover, it is ergodic.

Let the transfer function of individual subchannels be denoted by $F_{ij}(z)$ (transfer function between the i-th output and j-th input) having the IR $\{f_{ij}(k)\}$.
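
The model (1-1) can be simulated directly, which is a convenient way to check the notation. The following is a minimal sketch, assuming NumPy; the function name `simulate_fir_mimo`, the `(L+1, N, M)` array layout and the convention that inputs before time 0 are zero are illustrative assumptions, not part of the paper. The coefficient array holds the matrices $\mathbf{F}_0, \mathbf{F}_1, \mathbf{F}_2$ of the 3 × 2 channel used in Example 1 (Sec. 6), and the demo sources follow the description given there.

```python
import numpy as np

# F_0, F_1, F_2 of the 3x2 channel of Example 1, stored as (L+1, N, M).
F_EX1 = np.array([
    [[0.9078, 0.7471], [0.0, 0.0], [0.0, 0.0]],
    [[0.0, 1.1206], [0.7263, 0.5603], [0.0, 0.0]],
    [[0.9078, 0.7471], [-0.9078, -0.5603], [0.0, 0.0]],
])


def simulate_fir_mimo(F, w, noise_std=0.0, rng=None):
    """Return (y, s) with s(k) = sum_l F_l w(k-l) and y(k) = s(k) + n(k).

    F: (L+1, N, M) impulse-response matrices; w: (M, T) input sequences.
    noise_std: scalar or length-N vector of per-sensor noise standard deviations.
    Assumption of this sketch: w(k) = 0 for k < 0, and the noise is white Gaussian.
    """
    rng = np.random.default_rng() if rng is None else rng
    F = np.asarray(F)
    w = np.asarray(w)
    n_taps, n_out, _ = F.shape
    T = w.shape[1]
    s = np.zeros((n_out, T), dtype=np.result_type(F, w))
    for l in range(n_taps):
        # Contribution of tap l: F_l w(k - l) for k = l, ..., T-1.
        s[:, l:] += F[l] @ w[:, :T - l]
    std = np.asarray(noise_std, dtype=float).reshape(-1, 1)
    n = std * rng.standard_normal((n_out, T))
    return s + n, s


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T = 1500
    # Zero-mean one-sided exponential with variance 0.64, and a +/-1 binary source.
    w1 = rng.exponential(scale=0.8, size=T) - 0.8
    w2 = rng.choice([-1.0, 1.0], size=T)
    # Noise power at sensor 1 is nine times that at sensors 2 and 3 (std ratio 3:1:1).
    y, s = simulate_fir_mimo(F_EX1, np.vstack([w1, w2]), noise_std=[0.3, 0.1, 0.1], rng=rng)
    print(y.shape)
```

Feeding a unit impulse on one input returns the corresponding column of the impulse response, which is a quick sanity check on the indexing.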

The author is with the Department of Electrical Engineering, Auburn University, Auburn, AL 36849-5201, USA. E-mail: tugnait@eng.auburn.edu; Tel. (334)844-1846; Fax: (334)844-1809.

This work was supported by the National Science Foundation under Grant MIP-9312559 and by the Office of Naval Research under Grant N00014-97-1-0822.


The problem of blind separation of independent linear signals from their convolutive mixtures leads to the above mathematical model. In the convolutive mixture problem, M independent non-Gaussian signals $x_j(k)$ ($j = 1, 2, \ldots, M$) are observed at N sensors as

$$\mathbf{y}(k) = [\mathcal{H}(z)]\mathbf{x}(k) + \mathbf{n}(k) \tag{1-2}$$

where $\mathcal{H}(z)$ represents the convolutive mixture. Assume that

$$\mathbf{x}(k) = [\mathcal{V}(z)]\mathbf{w}(k) \tag{1-3}$$

where $\mathbf{w}(k)$ satisfies (AS3) and $\mathcal{V}(z)$ is diagonal. From (1-2) and (1-3), we obtain (1-1) where $\mathcal{F}(z) = \mathcal{H}(z)\mathcal{V}(z)$ and we have used (if needed) an FIR approximation. Past work on separation of convolutive mixtures may be categorized into several classes: time-domain approaches ([2], [6], [8], [11], [12]), frequency-domain approaches ([3]), adaptive (recursive) approaches ([6], [8], [11]), and non-recursive (batch) approaches ([2], [3], [12]). In this paper we present time-domain non-recursive (batch) approaches. Quite a few of the existing approaches are limited either to M = N = 2 ([3], [8]) or to M = N ([2], [6], [12]). In this paper we consider the general case of $N \ge M$ with M arbitrary.

Let $\mathcal{F}^{(i)}(z)$ denote the i-th column of $\mathcal{F}(z)$. In our formulation of the blind convolutive signal separation problem, we are interested in decomposing the observations at the various sensors into their independent components, i.e. in estimating $[\mathcal{F}^{(i)}(z)]w_i(k)$ for $i = 1, 2, \ldots, M$ given $\{\mathbf{y}(k)\}$ without any prior knowledge of $\mathcal{F}(z)$. Our main approach is to first estimate $\mathcal{F}(z)$ (Secs. 2-4) and then estimate $[\mathcal{F}^{(i)}(z)]w_i(k)$ via Wiener filtering (Sec. 5). Others have pursued a different approach as follows. Suppose that there exists a MIMO dynamic system $\mathcal{G}(z)$ with N inputs and M outputs such that the overall M × M system $\mathcal{T}(z) = \mathcal{G}(z)\mathcal{F}(z)$ decouples the source signals. Following the 2 × 2 case considered in [3], this implies that we must have ($T_{ij}(z)$ denotes the ij-th element of $\mathcal{T}(z)$)

$$T_{ij}(z) \begin{cases} = 0 & \text{for } i \ne i_j \\ \ne 0 & \text{for } i = i_j \end{cases} \tag{1-4}$$

where $i = 1, 2, \ldots, M$; $j = 1, 2, \ldots, M$ and $i_j \in \{1, 2, \ldots, M\}$ such that $i_j \ne i_l$ for $j \ne l$. That is, in every column and every row of $\mathcal{T}(z)$ there is exactly one non-zero entry. This approach occurs in the seminal paper [1] and others ([2], [3], [8] and references therein). By discarding all but one of the N entries of the N-vector $[\mathcal{F}^{(i)}(z)]w_i(k)$, we can get the solution specified by (1-4). The idea of decomposition of $\mathbf{y}(k)$ into its independent components $[\mathcal{F}^{(i)}(z)]w_i(k)$ to achieve source separation has appeared in [7] using higher-order statistics and in [11] using second-order statistics. In [11] it is required that $\mathrm{rank}\{\mathcal{H}(z)\} = M$ for any z (including $z = \infty$ but excluding $z = 0$) whereas our (AS2) leads to $\mathrm{rank}\{\mathcal{H}(z)\} = M$ only for $|z| = 1$; our examples in Sec. 6 do not satisfy the assumptions of [11]. On the other hand, [11] does not require the signals $\{\mathbf{x}(k)\}$ to be non-Gaussian or linear whereas our formulation relies crucially on $\{\mathbf{x}(k)\}$ being linear non-Gaussian.

The  assumption  of  linear  non-Gaussian  sources  allows  one  to  treat  the  problem  as  a  (blind)  linear 
system  identification  problem  using  higher-order  statistics.  Therefore,  existing  results  on  blind  system 
identification ([4], [5], [12] etc.) become quite relevant. In [5] (also [6]) a source-iterative, inverse-filter criteria-based approach has been developed. It was shown in [5] that the system matrix IR sequence $\{\mathbf{F}_l\}$ can be found up to a post-multiplication monomial matrix. The approach of [5] does not require knowledge of the model order (L in (1-1)). However, it yields biased IR estimates in noise. One of the purposes of this paper is to investigate alternative approaches to remedy this drawback so that consistent channel estimates may be used for source separation given dynamic mixtures. We propose to use a cumulant matching approach [4]. Since higher-order cumulants of Gaussian processes vanish, cumulant matching has the potential of yielding unbiased estimates. Using output cumulants, closed-form solutions have been given in [4] (and references therein) under several restrictive conditions: given a finite impulse response $\{\mathbf{F}_l\}_{l=0}^{L}$, model order L is known, and $\mathbf{F}_0$ and $\mathbf{F}_L$ are both of full column rank. Quadratic cumulant matching has also been performed in [4] under the same restrictive conditions. Since cumulant matching results in a nonlinear optimization problem, selection of good initial guesses is crucial. In [4] this is accomplished by using the closed-form solution. In [12] it has been shown that if two models have identical output cumulants, then their transfer functions (hence impulse responses) are equivalent up to a monomial matrix. However, [12] offers no algorithms for model identification. In this note we utilize the results of the approach of [5] as an initial guess for cumulant matching and related approaches (see further remarks in Sec. 3). Once the system IR has been estimated, we design an MMSE (minimum mean-square error) filter for signal separation in Sec. 5 using the estimated IR.

2  An  Iterative  Solution  Based  on  Inverse-Filter  Criteria  [5] 

Here we briefly review [5] whose analysis holds only for $\mathbf{n}(k) = 0$. Let $\mathrm{CUM}_4(w)$ denote the fourth-order cumulant of a complex-valued random variable w, defined as

$$\mathrm{CUM}_4(w) := \mathrm{cum}_4\{w, w^*, w, w^*\} = E\{|w|^4\} - 2[E\{|w|^2\}]^2 - |E\{w^2\}|^2. \tag{2-1}$$

We will use the notation $\gamma_{4w_i} = \mathrm{CUM}_4(w_i(k))$ and $\sigma_{w_i}^2 = E\{|w_i(k)|^2\}$. Consider a 1 × N row-vector polynomial equalizer $\mathcal{C}^T(z)$, with its j-th entry denoted by $C_j(z)$, operating on the data vector $\mathbf{y}(k)$. Let the equalizer output be denoted by $e(k) = \sum_{i=1}^{N} C_i(z) y_i(k)$. Following [5] consider maximization of the cost (an inverse-filter criterion)

$$J_{42} := |\mathrm{CUM}_4(e(k))| \, [E\{|e(k)|^2\}]^{-2} \tag{2-2}$$

for designing a linear equalizer to recover one of the inputs. It is shown in [5] that when (2-2) is maximized w.r.t. $\mathcal{C}(z)$, then e(k) is given by

$$e(k) = d\, w_{j_0}(k - k_0) \tag{2-3}$$

where d is some complex constant, $k_0$ is some integer, and $j_0$ indexes some input out of the given M inputs, i.e., the equalizer output is a possibly scaled and shifted version of one of the system inputs. It has been established in [5] that under (AS1)-(AS3) and no noise, such a solution exists and, if doubly-infinite equalizers are used, then all locally stable stationary points of the given cost w.r.t. the equalizer coefficients are also characterized by solutions such as (2-3).
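
The criterion (2-2) is easy to evaluate from data once expectations are replaced by sample averages. The sketch below, assuming NumPy, computes the sample version of (2-1) and the normalized cost $J_{42}$; the function names `cum4` and `j42` and the explicit mean removal are illustrative choices, not taken from [5]. It shows why maximizing the criterion favours a single source: a ±1 binary sequence gives $J_{42} = 2$, while an equal-weight sum of two such independent sequences gives a smaller value.

```python
import numpy as np


def cum4(e):
    """Sample fourth-order cumulant cum4{e, e*, e, e*} of a (complex) sequence, as in (2-1)."""
    e = np.asarray(e)
    e = e - e.mean()  # (2-1) is written for zero-mean variables
    m2 = np.mean(np.abs(e) ** 2)
    m4 = np.mean(np.abs(e) ** 4)
    return m4 - 2.0 * m2 ** 2 - np.abs(np.mean(e ** 2)) ** 2


def j42(e):
    """Normalized fourth-cumulant cost (2-2): |CUM4(e)| / (E{|e|^2})^2."""
    e = np.asarray(e)
    e = e - e.mean()
    return np.abs(cum4(e)) / np.mean(np.abs(e) ** 2) ** 2


if __name__ == "__main__":
    b1 = np.tile([1.0, 1.0, -1.0, -1.0], 250)  # two deterministic, mutually
    b2 = np.tile([1.0, -1.0, 1.0, -1.0], 250)  # uncorrelated +/-1 sequences
    print(j42(b1), j42(b1 + b2), j42(5.0 * b1))
```

The cost is invariant to scaling of its argument, which is the origin of the scale ambiguity d in (2-3).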


An iterative solution where we iterate on input sequences one-by-one is summarized in Table 1. In practice, all the expectations in (T-1) are replaced with their sample averages over appropriate data records. It has been shown in [4] that

$$\hat{y}_{i,j_0}(k) = \sum_{l} f_{ij_0}(l)\, w_{j_0}(k-l) \tag{2-4}$$

representing the contribution of $\{w_{j_0}(k)\}$ to the i-th sensor: blind signal separation.

Remark 1. We may replace the cost (2-2) with ([5]) $J_{32} := |\mathrm{CUM}_3(e(k))| \, [E\{|e(k)|^2\}]^{-3/2}$ where $\mathrm{CUM}_3(w) := \mathrm{cum}_3\{w, w^*, w\} = E\{|w|^2 w\}$. The preceding discussion pertaining to (2-2) holds in this case with obvious modifications provided we replace the phrase “nonzero fourth cumulants” in (AS3) with the phrase “nonzero third cumulants.” □

Remark 2. It has been shown in [5] that under the conditions (AS1)-(AS3) and no noise, the proposed iterative approach yields a transfer function $\mathcal{A}(z)$ which is related to $\mathcal{F}(z)$ via

$$\mathcal{A}(z) = \mathcal{F}(z)\,\mathbf{D}\,\mathbf{\Lambda}\,\mathbf{P} \tag{2-5}$$

where $\mathbf{D}$ is an M × M “time-shift” diagonal matrix, $\mathbf{\Lambda}$ is an M × M diagonal scaling matrix, and $\mathbf{P}$ is an M × M permutation matrix. □

3  Cumulant  Matching 

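Define

$$C_{ijkl}(\tau_1,\tau_2,\tau_3) := \mathrm{cum}_4\{y_i(t),\, y_j^*(t+\tau_1),\, y_k(t+\tau_2),\, y_l^*(t+\tau_3)\}. \tag{3-1}$$

Let $C_{ijkl}(\tau_1,\tau_2,\tau_3|\theta)$ denote the relevant variable parametrized by $\theta$ where $\theta$ denotes the vector of all unknown parameters composed of the elements of $\mathbf{F}_l$ for $l = 0, 1, \ldots, L$. Furthermore, let $\hat{C}_{ijkl}(\tau_1,\tau_2,\tau_3)$ denote a consistent data-based estimate of $C_{ijkl}(\tau_1,\tau_2,\tau_3)$ obtained by appropriate sample averaging. It is easily seen that for (1-1),

$$C_{ijkl}(\tau_1,\tau_2,\tau_3|\theta) = \sum_{m=1}^{M} \gamma_{4w_m} \sum_{t=0}^{L} f_{im}(t)\, f_{jm}^*(t+\tau_1)\, f_{km}(t+\tau_2)\, f_{lm}^*(t+\tau_3). \tag{3-2}$$

The cost function for parameter estimation via cumulant matching is given by

$$\sum_{i,j,k,l=1}^{N} \sum_{\tau_1=0}^{L} \sum_{\tau_2=0}^{L} \sum_{\tau_3=0}^{\tau_1} \left|\hat{C}_{ijkl}(\tau_1,\tau_2,\tau_3) - C_{ijkl}(\tau_1,\tau_2,\tau_3|\theta)\right|^2. \tag{3-3}$$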

During minimization of (3-3), $\gamma_{4w_m}$ (see (3-2)) is kept fixed at its value obtained from Sec. 2 using (2-3). This indirectly fixes the scale ambiguity ($\mathbf{\Lambda}$ in (2-5)). The initial values of $\theta$ are provided by the solution of Sec. 2. The choice of lags in (3-3) reflects the non-redundant region of support for cumulants of complex FIR processes [13]. Minimization of (3-3) can be performed using gradient-based methods (as in [9]) and/or using software packages. For the results presented in Sec. 6 we used NL2SOL [14] with the option of numerical gradients so that explicit equations for gradients were not used.
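
To make the cumulant-matching idea concrete, the following sketch evaluates model-based fourth-order output cumulants of an FIR MIMO channel and a squared-error matching cost. It is a hedged illustration, assuming NumPy and assuming the conjugation pattern cum4{y_i, y_j*, y_k, y_l*}, which mirrors (2-1); the names `model_cum4` and `matching_cost`, the `(L+1, N, M)` array layout and the dictionary of estimated cumulants are hypothetical. The set of lags is supplied by the caller, and no optimizer is included (the paper used NL2SOL, which is not reproduced here).

```python
import numpy as np


def model_cum4(F, gamma4, i, j, k, l, tau1, tau2, tau3):
    """Model cumulant: sum_m gamma4[m] sum_t f_im(t) f_jm*(t+tau1) f_km(t+tau2) f_lm*(t+tau3)."""
    F = np.asarray(F)
    n_taps, _, n_in = F.shape

    def f(row, m, t):
        # Impulse response is zero outside 0..L.
        return F[t, row, m] if 0 <= t < n_taps else 0.0

    total = 0.0
    for m in range(n_in):
        for t in range(n_taps):
            total += gamma4[m] * (f(i, m, t) * np.conj(f(j, m, t + tau1))
                                  * f(k, m, t + tau2) * np.conj(f(l, m, t + tau3)))
    return total


def matching_cost(F, gamma4, c_hat):
    """Sum of squared differences between estimated and model cumulants.

    c_hat maps (i, j, k, l, tau1, tau2, tau3) -> estimated cumulant value.
    """
    return sum(abs(value - model_cum4(F, gamma4, *key)) ** 2
               for key, value in c_hat.items())


if __name__ == "__main__":
    F_true = np.array([[[1.0]], [[0.5]]])  # scalar channel 1 + 0.5 z^-1
    gamma4 = [-2.0]                        # fourth cumulant of a +/-1 binary input
    keys = [(0, 0, 0, 0, t1, t2, t3) for t1 in range(2) for t2 in range(2) for t3 in range(t1 + 1)]
    c_hat = {key: model_cum4(F_true, gamma4, *key) for key in keys}
    print(matching_cost(F_true, gamma4, c_hat))
    print(matching_cost(np.array([[[1.0]], [[0.3]]]), gamma4, c_hat))
```

Because the cost is a sum of squared residuals, any nonlinear least-squares routine could in principle be applied to it, started from the Sec. 2 solution as the text describes.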
Remark 3. With $\hat{C}_{ijkl}(\tau_1,\tau_2,\tau_3)$ replaced with its true value $C_{ijkl}(\tau_1,\tau_2,\tau_3|\theta_0)$ in (3-3), it follows from [12] that under the conditions (AS1)-(AS4), minimization of (3-3) (with assumed model order $\bar{L} \ge L$) will yield a transfer function $\mathcal{A}(z)$ which is related to the true transfer function via (2-5). If $\hat{C}_{ijkl}(\tau_1,\tau_2,\tau_3)$ is a strongly consistent estimator of $C_{ijkl}(\tau_1,\tau_2,\tau_3|\theta_0)$, then the (global) minimizer of (3-3) will yield with probability one a transfer function satisfying (2-5) [4]. The problem is how to ensure global minimization of (3-3). Herein lies the value of the iterative approach of Sec. 2. Recall that the approach of Sec. 2 yields a consistent IR estimator only under vanishing measurement noise.

4  Correlation  and  Cumulant  Matching 

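Let $\Theta$ denote $\theta$ of Sec. 3 augmented with $E\{|w_i(k)|^2\}$ ($= \sigma_{w_i}^2$), $i = 1, 2, \ldots, M$. Define $C_{ij}(\tau) := E\{y_i(t+\tau)\, y_j^*(t)\}$ and let $C_{ij}(\tau|\Theta)$ denote $C_{ij}(\tau)$ parametrized by $\Theta$. Then $C_{ij}(\tau|\Theta) = \sum_{m=1}^{M} \sigma_{w_m}^2 \sum_{t=0}^{L} f_{im}(t+\tau)\, f_{jm}^*(t)$. Let $\hat{C}_{ij}(\tau)$ denote a consistent data-based estimate of $C_{ij}(\tau)$. The cost (3-3) may be augmented with correlation matching to devise the cost

$$\big[\text{cost (3-3)}\big] + \lambda \sum_{i,j=1}^{N} \sum_{\tau=1}^{L} \left|\hat{C}_{ij}(\tau) - C_{ij}(\tau|\Theta)\right|^2 \tag{4-1}$$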

where the nonnegative scalar $\lambda$ in (4-1) is chosen to provide relative weighting between correlation and cumulant matching (as in [9] and [13] for scalar systems). Following [9] we choose
NLLfTi  ^  N  L 

^  =  ^0  ^  ^  X)  £ 

_i,j,k,l=l  n=0  T2=0  T3=0  J  T=0 

where $\lambda_0 > 0$. By (4-2) $\lambda$ is invariant to any scaling of the data. For simulation results presented later, we picked $\lambda_0 = 1$. The initial values of parts of $\Theta$ that are common to $\theta$ are selected as in Sec. 3. The initial values of $\sigma_{w_i}^2$ are obtained using (2-3) in a manner similar to that in Step (i) of Table 1. The cost (4-1) is useful when noise is white Gaussian (notice the exclusion of $\tau = 0$ in (4-1)). It allows us to exploit the signal correlations at nonzero lags. It can be easily modified to incorporate different prior knowledge, such as a known noise correlation.

5  Blind  Convolutive  Signal  Separation 

As noted earlier, our objective is to estimate $[\mathcal{F}^{(i)}(z)]w_i(k)$ for $i = 1, 2, \ldots, M$ given $\{\mathbf{y}(k)\}$. The solution of Sec. 2 provides a solution in the form of (2-4) but it is not necessarily an MMSE solution. We will now discuss other possible solutions, particularly when cumulant (or related) matching is used to estimate the (overall) system transfer function. Let $\mathbf{f}_l^{(i)}$ denote the i-th column of $\mathbf{F}_l$. Let $\hat{\mathbf{F}}_l$ and $\hat{\mathbf{f}}_l^{(i)}$ denote the estimates of $\mathbf{F}_l$ and $\mathbf{f}_l^{(i)}$, respectively. We wish to design a linear MMSE filter $\{\mathbf{G}_i\}_{i=0}^{L_e}$ of length $L_e + 1$ to estimate $\mathbf{y}^{(j)}(k-d)$ given $\mathbf{y}(l)$ for $l = k, k-1, \ldots, k-L_e$ where $d \ge 0$,

$$\mathbf{y}^{(j)}(k) := [\mathcal{F}^{(j)}(z)]\, w_j(k) = \sum_{l=0}^{L} \mathbf{f}_l^{(j)} w_j(k-l), \tag{5-1}$$

$$\hat{\mathbf{y}}^{(j)}(k-d) := \sum_{i=0}^{L_e} \mathbf{G}_i\, \mathbf{y}(k-i). \tag{5-2}$$


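Both $L_e$ and the delay d are “pre-determined.” Using the orthogonality principle [10], the normal equations for the MMSE estimator simplify to

$$\sum_{i=0}^{L_e} \mathbf{G}_i\, \mathbf{R}_{yy}(m-i) = \sigma_{w_j}^2 \sum_{k=0}^{L} \mathbf{f}_{k+m-d}^{(j)}\, \mathbf{f}_k^{(j)H}, \quad m = 0, 1, \ldots, L_e, \tag{5-3}$$

where $\mathbf{R}_{yy}(m) := E\{\mathbf{y}(k+m)\,\mathbf{y}^H(k)\}$ ($H$ denotes the Hermitian operation), $\mathbf{f}_l^{(j)} := 0$ for $l < 0$ or $l > L$, and

$$\mathbf{H}_{d-m} := \sum_{k=0}^{L} \mathbf{f}_{k+m-d}^{(j)}\, \mathbf{f}_k^{(j)H} = \sigma_{w_j}^{-2}\, E\{\mathbf{y}^{(j)}(t+m-d)\, \mathbf{y}^{(j)H}(t)\}. \tag{5-4}$$

Note that a shift in the sequence $\{\mathbf{f}_l^{(j)}\}$ leaves $\mathbf{H}_{d-m}$ unaffected. In order to obtain a data-based solution, we simply replace all the unknowns by their estimates. Since there is an inherent scale ambiguity in estimating the composite channel impulse response (cf. (2-5)), we design the equalizer only up to a scale factor by omitting $\sigma_{w_j}^2$ from (5-3). Denoting the so modified equalizer gains as $\tilde{\mathbf{G}}_i$ (instead of $\mathbf{G}_i$), we have the solution

$$\big[\, \tilde{\mathbf{G}}_0 \;\; \tilde{\mathbf{G}}_1 \;\; \cdots \;\; \tilde{\mathbf{G}}_{L_e} \,\big] = \big[\, \hat{\mathbf{H}}_d \;\; \hat{\mathbf{H}}_{d-1} \;\; \cdots \;\; \hat{\mathbf{H}}_{d-L_e} \,\big]\, \hat{\mathcal{R}}_{yy}^{-1} \tag{5-5}$$

where $\hat{\mathbf{H}}_{d-m} := \sum_{k=0}^{L} \hat{\mathbf{f}}_{k+m-d}^{(j)}\, \hat{\mathbf{f}}_k^{(j)H}$, $\hat{\mathbf{R}}_{yy}(\tau) := T^{-1} \sum_{t=1}^{T} \mathbf{y}(t+\tau)\, \mathbf{y}^H(t)$, T = record length, and $[\hat{\mathcal{R}}_{yy}]_{ij} = \hat{\mathbf{R}}_{yy}(j-i)$ = ij-th block of $\hat{\mathcal{R}}_{yy}$. We assume that noise is such that the inverse in (5-5) exists, else a pseudo-inverse is warranted. The estimates $\hat{\mathbf{f}}_l^{(j)}$ above may be obtained by any of the previous approaches resulting in several possible choices. Under (AS3)-(AS4), $\hat{\mathbf{R}}_{yy}(m)$ is a consistent estimator of $\mathbf{R}_{yy}(m)$. Therefore, if $\hat{\mathbf{f}}_l^{(j)}$ is a consistent estimator (to within the ambiguities specified in (2-5)), then asymptotically we have the desired MMSE linear equalizer within a scale factor. This holds true for the approaches of Secs. 3 and 4, but not for that of Sec. 2.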
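
The design of the separating filter can be sketched numerically. The code below is an illustration under stated assumptions rather than the paper's implementation: it assumes NumPy, takes the orthogonality-principle form "sum over i of G_i R_yy(m - i) equals the sum over k of f_(k+m-d) f_k^H" for m = 0, ..., L_e with the source variance omitted, and builds the output correlations from the model instead of from data. The names `model_ryy` and `mmse_separator`, the `(L+1, N, M)` layout and the white-noise assumption are hypothetical.

```python
import numpy as np


def model_ryy(F, sigma2, noise_var=0.0):
    """Return a function tau -> R_yy(tau) = E{y(k+tau) y^H(k)} for white inputs and white noise."""
    F = np.asarray(F)
    n_taps, n_out, _ = F.shape
    S = np.diag(np.asarray(sigma2, dtype=float))

    def ryy(tau):
        R = np.zeros((n_out, n_out), dtype=complex)
        for k in range(n_taps):
            if 0 <= k + tau < n_taps:
                R += F[k + tau] @ S @ F[k].conj().T
        if tau == 0:
            R += noise_var * np.eye(n_out)
        return R

    return ryy


def mmse_separator(F, j, ryy, Le, d, rel_tol=1e-3):
    """Filter taps G_0..G_Le (each N x N) estimating source j's contribution with delay d."""
    F = np.asarray(F)
    n_taps, n_out, _ = F.shape

    def H(m):
        # H_{d-m} = sum_k f_{k+m-d} f_k^H, with f_l = 0 outside 0..L.
        out = np.zeros((n_out, n_out), dtype=complex)
        for k in range(n_taps):
            if 0 <= k + m - d < n_taps:
                out += np.outer(F[k + m - d][:, j], F[k][:, j].conj())
        return out

    H_row = np.hstack([H(m) for m in range(Le + 1)])
    # Block (i, m) of the big correlation matrix is R_yy(m - i).
    R_big = np.block([[ryy(m - i) for m in range(Le + 1)] for i in range(Le + 1)])
    G = H_row @ np.linalg.pinv(R_big, rcond=rel_tol)
    return [G[:, i * n_out:(i + 1) * n_out] for i in range(Le + 1)]


if __name__ == "__main__":
    F = np.array([[[1.0]]])  # trivial 1x1 channel with a single tap
    taps = mmse_separator(F, 0, model_ryy(F, [1.0]), Le=2, d=1)
    print([t.real.round(3) for t in taps])  # a pure one-sample delay
```

In a data-based design the callable `ryy` would return sample correlations and `F` would hold the estimated impulse response; the pseudo-inverse handles the rank-deficient case mentioned in the text.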

6  Simulation  Examples 

We now present two simulation examples. In both the examples $\mathbf{F}_0$ is of rank 1 < M = 2. Calculation of $\hat{\mathcal{R}}_{yy}^{-1}$ (cf. (5-5)) was performed via singular value decomposition where all singular values < [0.001 × (largest singular value)] were neglected. This results in a pseudo-inverse. The various performance measures used (and their computational details) are shown in Table 2. Nonlinear optimization was done using NL2SOL with numerical gradients [14].
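
The truncated-SVD inversion described above can be written in a few lines. This is a minimal sketch, assuming NumPy; the function name `truncated_pinv` and the parameter `rel_tol` are illustrative. Singular values smaller than `rel_tol` times the largest one are discarded before inverting, with `rel_tol = 0.001` matching the threshold quoted in the text.

```python
import numpy as np


def truncated_pinv(R, rel_tol=1e-3):
    """Pseudo-inverse of R that neglects singular values below rel_tol * (largest singular value)."""
    U, s, Vh = np.linalg.svd(R, full_matrices=False)
    keep = s >= rel_tol * s[0]  # s is sorted in decreasing order
    s_inv = np.zeros_like(s)
    s_inv[keep] = 1.0 / s[keep]
    # R^+ = V diag(1/s) U^H, with the neglected directions set to zero.
    return (Vh.conj().T * s_inv) @ U.conj().T


if __name__ == "__main__":
    R = np.diag([1.0, 1e-2, 1e-5])
    print(np.diag(truncated_pinv(R)))
```

NumPy's `np.linalg.pinv` applies a similar relative cutoff through its `rcond` argument, so it can be used as an alternative.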

Example 1. Consider a 2-input 3-output MA(2) system model resulting in N = 3 and M = 2 in (1-1). Its 3 × 2 transfer function $\mathcal{F}(z)$ was chosen as

$$\mathcal{F}(z) = \begin{bmatrix} 0.9078 + 0.9078z^{-2} & 0.7471 + 1.1206z^{-1} + 0.7471z^{-2} \\ 0.7263z^{-1} - 0.9078z^{-2} & 0.5603z^{-1} - 0.5603z^{-2} \\ 0 & 0 \end{bmatrix}. \tag{6-1}$$

The last row of (6-1) is identically zero, signifying that the third ‘sensor’ is not receiving any information signal, just noise. The inputs $\{w_j(k)\}$ ($j = 1, 2$) are mutually independent, zero-mean and i.i.d. such that $w_1(k)$ is one-sided exponential with variance 0.64, and $w_2(k)$ is binary taking values ±1.0 with probability 0.5 each. The noise at the three sensors is mutually independent, zero-mean white Gaussian such that the noise power at the first sensor is nine times the noise power at the other two sensors, the latter being equal. Fig. 1 shows the “subchannel” amplitude spectra for the non-null subchannels where the ij-th subchannel refers to $F_{ij}(z)$.

The source-iterative approach of Table 1 was applied to inverse filter the data, to estimate the system IR, and to carry out signal separation. The length of the inverse filters was 15 samples per sensor/output.
The average signal-to-noise ratio (SNR $= N^{-1}\sum_{i=1}^{N} [E\{|s_i(k)|^2\}/E\{|n_i(k)|^2\}]$, where $s_i(k)$ is the i-th component of $\mathbf{s}(k)$ in (1-1)) was taken to be 30 dB, 20 dB, 10 dB and 5 dB, respectively, in two sets of 50 Monte Carlo runs with varying record lengths of 1500 and 9000 samples per run, respectively. The results of this approach were used to initialize minimization of (3-3) with $\bar{L} = 3$ and also of (4-1) with $\bar{L} = 3$ and $\lambda_0 = 1$.
The channel estimation errors (NMSE) are shown in Fig. 2. The average SINR (= (SINR$_1$ + SINR$_2$)/2) values are shown in Figs. 3 and 4. To design the MMSE equalizer (5-5) we took $L_e = 14$ (as for inverse filters in Table 1) and d = 7 in all cases. The approach labeled “inverse filter criterion” in Figs. 3 and 4 uses (T-2) for source separation; other approaches use the MMSE filter of Sec. 5 based upon the estimated channel. It is seen that the inverse filter criteria based approach of Sec. 2 coupled with the MMSE filter with delay d = 7 performs quite well for signal separation at higher SNRs. At lower SNRs, cumulant matching does better. The benefits of introducing a delay in signal separation are clear from Figs. 3 and 4. The upper bounds shown in Figs. 3 and 4 were obtained by using the true values of $\mathbf{f}_l^{(j)}$ in (5-4) and estimated $\mathcal{R}_{yy}$ for upper bound (est. cor.) and true $\mathcal{R}_{yy}$ for upper bound (true cor.).

Example 2. Consider a 2-input 3-output MA(6) system model resulting in N = 3 and M = 2 in (1-1). Its 3 × 2 transfer function $\mathcal{F}(z)$ was chosen as


0.7426  -f  0.7426Z-2  0.5678  -I-  0.3407^“^ 

0.4456z-^  +  0.7426Z-2  _o.2385^-^  -  0.5678^-2  -f  0.8176z-2  ^  0.4088^"'^  -f  0.2385^-® 
0.8911^-2  H-  0.5941Z-3  0.6814^-^  -f  0.9085z-2 


(6-2) 


This example has been taken from [5]. The inputs $\{w_j(k)\}$ ($j = 1, 2$) are mutually independent, zero-mean and i.i.d. such that $w_1(k)$ takes values ±0.8 with probability 0.5 each, and $w_2(k)$ takes values ±1.0 with probability 0.5 each. The noise at the three sensors is mutually independent, zero-mean white Gaussian such that the noise power at all the sensors is the same. Fig. 5 shows the “subchannel” amplitude spectra. The simulation results are shown in Figs. 6-8 using the same procedure and parameters as for Example 1 except that now we take $\bar{L} = 7$. It is seen that the inverse filter criteria based approach of Sec. 2 coupled with the MMSE filter performs quite well for signal separation at higher SNRs.


7  Conclusions 

The problem of blind separation of independent linear non-Gaussian signals from their linear convolutive mixtures observed in additive Gaussian noise of unknown correlation function was considered. Emphasis was on a two-step procedure where first we estimate the system IR (using one of three approaches) and then design an MMSE filter with a controlled delay for signal separation based upon the estimated IR. Two simulation examples were presented where it was found that the introduction of the delay in MMSE filter design significantly improved the separation performance at the expense of increased computational complexity.
Table 1: Source-iterative blind signal separation.

(i) Maximize (2-2) w.r.t. the equalizer $\mathcal{C}(z)$ to obtain (2-3). Let $\hat{\gamma}_{4w_{j_0}} = \mathrm{CUM}_4(e(k)) = \mathrm{CUM}_4(d\, w_{j_0}(k))$.

(ii) Cross-correlate $\{e(k)\}$ (of (2-3)) with the given data (1-1) and define a possibly scaled and shifted estimate of $f_{ij_0}(\tau)$ as

$$\hat{f}_{ij_0}(\tau) := E\{y_i(k)\, e^*(k-\tau)\}/E\{|e(k)|^2\}. \tag{T-1}$$

Consider now the reconstructed contribution of e(k) to the data $y_i(k)$ ($i = 1, 2, \ldots, N$), denoted by $\hat{y}_{i,j_0}(k)$:

$$\hat{y}_{i,j_0}(k) := \sum_{l} \hat{f}_{ij_0}(l)\, e(k-l). \tag{T-2}$$

(iii) Remove the above contribution from the data to define the outputs of a MIMO system with N outputs and M − 1 inputs. These are given by

$$y_i'(k) := y_i(k) - \hat{y}_{i,j_0}(k). \tag{T-3}$$

(iv) If M > 1, set $M \leftarrow M - 1$, $y_i(k) \leftarrow y_i'(k)$, and go back to Step (i), else quit.
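
Steps (ii) and (iii) of Table 1 can be sketched as follows, assuming that an equalizer output e(k) satisfying (2-3) is already available from Step (i). This is an illustrative sketch assuming NumPy; the names `shift` and `deflate`, the zero padding at the record edges and the user-supplied set of lags are assumptions, and the reconstruction step takes the contribution to be the estimated impulse response convolved with e(k).

```python
import numpy as np


def shift(e, tau):
    """Return the sequence e(k - tau), zero-padded at the record edges."""
    out = np.zeros_like(e)
    if tau >= 0:
        out[tau:] = e[:len(e) - tau]
    else:
        out[:tau] = e[-tau:]
    return out


def deflate(y, e, taus):
    """Estimate f_hat(tau) by cross-correlation, rebuild the source's contribution, and remove it.

    y: (N, T) data; e: (T,) equalizer output; taus: integer lags to use (may be negative).
    Returns (f_hat, contribution, residual), where f_hat maps tau -> length-N vector.
    """
    y = np.asarray(y)
    e = np.asarray(e)
    T = y.shape[1]
    power = np.mean(np.abs(e) ** 2)
    contribution = np.zeros(y.shape, dtype=np.result_type(y, e))
    f_hat = {}
    for tau in taus:
        e_tau = shift(e, tau)
        # Sample version of E{y_i(k) e*(k - tau)} / E{|e(k)|^2}.
        f_hat[tau] = (y @ np.conj(e_tau)) / (T * power)
        contribution += np.outer(f_hat[tau], e_tau)
    return f_hat, contribution, y - contribution


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.choice([-1.0, 1.0], size=4000)
    y = np.vstack([w + 0.5 * shift(w, 1), 0.8 * shift(w, 1)])
    e = 2.0 * shift(w, 1)  # scaled and delayed copy of the source: d = 2, k0 = 1
    f_hat, contribution, residual = deflate(y, e, range(-3, 4))
    print(np.round(f_hat[-1], 2), np.round(f_hat[0], 2))
```

With a single noise-free source the residual is close to zero, and the scale d and delay $k_0$ of (2-3) cancel in the reconstructed contribution, which is why the decomposition is free of those ambiguities.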


FIGURE CAPTIONS

Fig. 1. Example 1. Amplitude spectra $20\log_{10}|F_{ij}(e^{j\omega})|$ of various subchannels. [Subchannels $F_{31}$ and $F_{32}$ are not shown as $F_{31}(e^{j\omega}) = F_{32}(e^{j\omega}) = 0 \ \forall \omega$.]

Fig. 2. Example 1. Normalized mean-square error (T-6) in estimating channel matrix impulse response using various approaches, averaged over 50 Monte Carlo runs. T = record length.

Fig. 3. Example 1. Average SINR (signal-to-interference-and-noise ratio) after blind signal separation using various approaches, averaged over 50 Monte Carlo runs. Record length T = 1500.

Fig. 4. Example 1. Average SINR (signal-to-interference-and-noise ratio) after blind signal separation using various approaches, averaged over 50 Monte Carlo runs. Record length T = 9000.

Fig. 5. Example 2. Amplitude spectra $20\log_{10}|F_{ij}(e^{j\omega})|$ of various subchannels.

Fig. 6. Example 2. Normalized mean-square error (T-9) in estimating channel matrix impulse response using various approaches, averaged over 50 Monte Carlo runs. T = record length.

Fig. 7. Example 2. Average SINR (signal-to-interference-and-noise ratio) after blind signal separation using various approaches, averaged over 50 Monte Carlo runs. Record length T = 1500.

Fig. 8. Example 2. Average SINR (signal-to-interference-and-noise ratio) after blind signal separation using various approaches, averaged over 50 Monte Carlo runs. Record length T = 9000.
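Table 2: Performance measures.

EXAMPLE 1

Normalization: First remove the ambiguities associated with the estimated channel IR (cf. (2-5)). True model (6-1) is such that

$$\sum_{i=1}^{3} \sum_{k=0}^{2} |f_{ij}(k)|^2 = 3 \quad \text{for } j = 1 \text{ and } j = 2. \tag{T-4}$$

Truncate the estimated IR to 4 samples after alignment with the true IR and then normalize it to satisfy

$$\sum_{i=1}^{3} \sum_{k=-1}^{2} |\hat{f}_{ij}(k)|^2 = 3 \quad \text{for } j = 1 \text{ and } j = 2. \tag{T-5}$$

NMSE: The NMSE (normalized mean-square error) is defined as

$$\mathrm{NMSE} = \frac{M_c^{-1} \sum_{l=1}^{M_c} \sum_{i=1}^{3} \sum_{j=1}^{2} \sum_{\tau=-1}^{2} |\hat{f}_{ij}^{(l)}(\tau) - f_{ij}(\tau)|^2}{\sum_{i=1}^{3} \sum_{j=1}^{2} \sum_{\tau=-1}^{2} |f_{ij}(\tau)|^2} \tag{T-6}$$

where $\hat{f}_{ij}^{(l)}(\tau)$ denotes the estimate of the ij-th subchannel IR for the l-th Monte Carlo run and there are $M_c$ runs.

SINR: For signal separation the performance measure was taken to be the signal-to-interference-and-noise ratio (SINR) per source signal, defined as

$$\mathrm{SINR}_j = \frac{E\{\|\mathbf{y}^{(j)}(k)\|^2\}}{E\{\|\mathbf{y}^{(j)}(k) - \alpha\, \hat{\mathbf{y}}^{(j)}(k)\|^2\}} \tag{T-7}$$

where $\alpha$ is that value of the scalar a which minimizes $E\{\|\mathbf{y}^{(j)}(k) - a\, \hat{\mathbf{y}}^{(j)}(k)\|^2\}$; this is needed to remove the scale ambiguity in the design of (5-3); it does not affect the SINR.

EXAMPLE 2

Normalization: The counterpart to (T-5) is taken as

$$\sum_{i=1}^{3} \sum_{k=-1}^{6} |\hat{f}_{ij}(k)|^2 = 3 \quad \text{for } j = 1 \text{ and } j = 2. \tag{T-8}$$

NMSE: The NMSE is modified as

$$\mathrm{NMSE} = \frac{M_c^{-1} \sum_{l=1}^{M_c} \sum_{i=1}^{3} \sum_{j=1}^{2} \sum_{\tau=-1}^{6} |\hat{f}_{ij}^{(l)}(\tau) - f_{ij}(\tau)|^2}{\sum_{i=1}^{3} \sum_{j=1}^{2} \sum_{\tau=-1}^{6} |f_{ij}(\tau)|^2} \tag{T-9}$$

SINR: As for Example 1.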
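
The SINR measure can be computed from a true and an estimated signal component as sketched below. This is a hedged illustration assuming NumPy, with expectations replaced by sample sums and with the SINR taken as the power of the true component divided by the power of the residual after optimal scaling; the function name `sinr` is hypothetical. The least-squares scale factor has the closed form "inner product of the estimate with the truth, divided by the energy of the estimate".

```python
import numpy as np


def sinr(y_true, y_hat):
    """Return (SINR, alpha), where alpha minimizes sum ||y_true - a * y_hat||^2 over scalars a.

    y_true, y_hat: arrays of identical shape, e.g. (N, T) sensor-by-time records.
    """
    y_true = np.asarray(y_true)
    y_hat = np.asarray(y_hat)
    # np.vdot conjugates its first argument and flattens both inputs.
    alpha = np.vdot(y_hat, y_true) / np.vdot(y_hat, y_hat)
    err = y_true - alpha * y_hat
    ratio = np.sum(np.abs(y_true) ** 2) / np.sum(np.abs(err) ** 2)
    return float(ratio), alpha


if __name__ == "__main__":
    y = np.array([[1.0, 0.0, 1.0, 0.0]])
    p = np.array([[0.0, 1.0, 0.0, -1.0]])  # perturbation orthogonal to y
    ratio, alpha = sinr(y, 2.0 * y + p)
    print(ratio, alpha, 10.0 * np.log10(ratio))
```

Rescaling the estimate changes the scale factor but not the SINR, consistent with the remark in Table 2 that removing the scale ambiguity does not affect the measure.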
[Fig. 1: amplitude spectra; panels for subchannels 11, 21, 12 and 22, each plotted against frequency.]

[Figure: Example 1, channel estimation error.]

[Figure: Example 1, signal separation (signal-to-interference-and-noise ratio; T = 1500).]

[Figure: Example 1, signal separation (signal-to-interference-and-noise ratio; T = 9000).]

[Figure: Example 2, signal separation (average SINR in dB; T = 1500).]

[Figure: Example 2, signal separation (signal-to-interference-and-noise ratio; T = 9000).]
Abstract

We investigate a suboptimal approach to the fixed-lag smoothing problem for Markovian switching systems. A fixed-lag smoothing algorithm is developed by applying the basic Interacting Multiple Model (IMM) approach to a state-augmented system. The computational load is roughly d (the fixed lag) times beyond that of filtering for the original system. In addition, an algorithm that approximates the “fixed-lag” mode probabilities given measurements up to current time is proposed. The algorithm is illustrated via a target tracking simulation example where a significant improvement over the filtering algorithm is achieved. The IMM fixed-lag smoothing performance for the given example is comparable to that of an existing IMM fixed-interval smoother. Compared to fixed-interval smoothers, the fixed-lag smoothers can be implemented in real-time with a small delay.


This work was supported by the Office of Naval Research under Grant N00014-97-1-0822.


1  Introduction 


The system with Markovian switching coefficients considered in this paper is represented by multiple linear models with a given probability of switching between the models. The model in effect is one of the $n$ hypothesized models $M^1, \ldots, M^n$ for the system, and the event that model $j$ is in effect during the sampling period ending at time $t_k$ (i.e., the sampling period $(t_{k-1}, t_k]$) will be denoted by $M_k^j$. The state dynamics and measurements, respectively, are modeled as

$$x_k = F_{k-1}^j x_{k-1} + G_{k-1}^j v_{k-1}^j \tag{1}$$

and

$$z_k = H_k^j x_k + w_k^j \tag{2}$$

where $x_k$ is the system state at $t_k$ and of dimension $n_x$, $z_k$ is the measurement vector at $t_k$ and of dimension $n_z$, and $F_{k-1}^j$, $G_{k-1}^j$ and $H_k^j$ are the system matrices when model $j$ is in effect over the sampling period ending at $t_k$. The process noise $v_{k-1}^j$ and measurement noise $w_k^j$ are mutually uncorrelated zero-mean white Gaussian processes with covariance matrices $Q_{k-1}^j$ and $R_k^j$, respectively. At the initial time $t_0$, the initial conditions for the system state under each model $j$ are assumed to be Gaussian random variables with mean $\hat{x}_0^j$ and covariance $P_0^j$. The prior statistics $\hat{x}_0^j$ and $P_0^j$ are assumed known, as is $\mu_0^j = P\{M_0^j\}$, the probability of model $j$ at the initial time $t_0$. The switching from model $M_{k-1}^i$ to model $M_k^j$ is governed by a finite-state stationary Markov chain with transition probabilities $p_{ij} = P\{M_k^j | M_{k-1}^i\}$ which are assumed known.

Motivation for considering system models with switching coefficients (also called stochastic hybrid systems [11]) stems from applicability of such models to a large class of real-world problems such as systems subject to failures/repairs, approximation of nonlinear systems with a set of piecewise linearized models, target tracking, etc. [1], [4], [9]-[11].

This paper is concerned with the problem of state estimation for the stochastic hybrid system (1)-(2). This problem has attracted considerable attention in the literature; see [1], [2], [4]-[7], [10], [11] and references therein. Most of the attention has been focused on the filtering problem where one is interested in estimating the state $x_k$ at time $k$ given the current and past measurements $Z_1^k := \{z_1, \ldots, z_k\}$. The optimal MMSE (minimum mean-square error) filter requires $n^k$ Kalman filters in parallel in order to obtain the optimal state filtered-estimate at time $k$. Thus the optimal approach is not practical and suboptimal techniques have to be considered. Several suboptimal techniques have been investigated in the literature [1], [2], [11]. The interacting multiple model (IMM) algorithm of [2] has been found to offer a good compromise between the computational and storage requirements and estimation accuracy [10], [11].

The state smoothing problem for stochastic hybrid systems has attracted much less attention. Here one is interested in estimating the state $x_k$ given past and future data $Z_1^N$ ($N > k$). The fixed-interval smoothing problem (where the record length $N$ is fixed) has been considered in [6] and [7]. Both [6] and [7] have used some versions of the IMM algorithm in order to implement fixed-interval smoothing. It is stated in [6, Sec. VIII] that “... we believe that the time-reversion and smoothing techniques developed are of interest to other hybrid state estimation problems ... This leads immediately to the question if and how IMM-smoothing can be extended to fixed-lag smoothing.” This paper is concerned with the problem of fixed-lag smoothing using an IMM approach. In fixed-lag smoothing with lag $d$ ($d > 0$) one is interested in estimating the state $x_k$ given past and part of future data $Z_1^{k+d}$ where $d$ is fixed. Equivalently, one looks for an estimate of $x_{k-d}$ given data $Z_1^k$. For $d = 0$ we have the filtering solution.

For linear systems with completely known parameters, it is well known that fixed-lag smoothing leads to an improvement in the performance (at the cost of increased computational complexity) when compared with the zero-lag case (filtering) [8]. Indeed, in most cases, a “small” lag leads to a performance almost as good as that due to fixed-interval smoothing [8]. An advantage of fixed-lag smoothing over fixed-interval smoothing is that the former can be implemented in real time with a small fixed time-delay whereas the latter has to wait for the entire measurement record.

Fixed-lag  smoothing  for  stochastic  hybrid  systems  has  been  investigated  in  [5].  In  [5]  a 
hypothesis-pruning  approach  (called  detection-estimation  [1])  has  been  considered  for  state 
estimation.  Since  the  IMM  algorithm  (which  belongs  to  the  class  of  generalized  pseudo-Bayes 
algorithms  [2])  has  been  found  to  perform  better  than  the  hypothesis-pruning  approaches 
for  the  same  computational  complexity,  it  is  of  some  interest  to  investigate  IMM  algorithm 
based  fixed-lag  smoothing. 

The paper is organized as follows. The basic IMM filtering algorithm is reviewed in Sec. 2. A state-augmentation approach is followed in Sec. 3 to derive an IMM fixed-lag smoothing algorithm via the IMM filtering algorithm discussed in Sec. 2. Sec. 3 is focused on state estimation. It is of considerable interest to compute the conditional mode probabilities $P(M_{k-i}^j | Z_1^k)$ (given the data). Certain approximations are suggested in Sec. 4 to compute these probabilities. A discussion of the computational requirements of the proposed fixed-lag IMM smoother as compared with that of the IMM filter is provided in Sec. 5. In Sec. 6 we illustrate the proposed approach via a target tracking simulation example taken from [7]. When compared with the results of [7], it shows that a delay of just a few samples leads to a performance comparable to that of the fixed-interval smoothing of [7]. Finally, some concluding remarks are provided in Sec. 7.

2  Basic  IMM  Algorithm 

The IMM algorithm [2] for state filtering is based on running $n$ “mode-matched” state estimation filters which exchange information (interact) at each sampling instant. It assumes that the conditional probability density $f(x_k | M_k^j, Z_1^k)$ is Gaussian with mean $\hat{x}_{k|k}^j := E\{x_k | M_k^j, Z_1^k\}$ and covariance $P_{k|k}^j := E\{[x_k - \hat{x}_{k|k}^j][x_k - \hat{x}_{k|k}^j]' | M_k^j, Z_1^k\}$, where the symbol $'$ denotes the transpose operation. In reality, however, the density $f(x_k | M_k^j, Z_1^k)$ is a Gaussian sum (containing $n^{k-1}$ terms).

As the algorithm is well explained in [1] (see also [10] and [11]), we will only briefly outline below the basic steps in “one cycle” (i.e. processing needed to update for a new measurement) of the IMM filtering algorithm. We follow Table I of [10] for the most part.

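Initialization: Given the state estimate $\hat{x}_{k-1|k-1}^j$, the associated covariance matrix $P_{k-1|k-1}^j$ and the conditional mode probability $\mu_{k-1}^j := P(M_{k-1}^j | Z_1^{k-1})$ for each $j \in \mathcal{M}_n := \{1, 2, \ldots, n\}$. For $k = 1$, we take $\hat{x}_{0|0}^j = \hat{x}_0^j$, $P_{0|0}^j = P_0^j$ and $\mu_0^j = P(M_0^j)$.

Interaction ($\forall j \in \mathcal{M}_n$):

predicted mode probability:

$$\mu_k^{j-} := P\{M_k^j | Z_1^{k-1}\} = \sum_{i=1}^{n} p_{ij}\, \mu_{k-1}^i \tag{3}$$

mixing probability:

$$\mu^{i|j} := P\{M_{k-1}^i | M_k^j, Z_1^{k-1}\} = p_{ij}\, \mu_{k-1}^i / \mu_k^{j-} \tag{4}$$

mixed estimate:

$$\hat{x}_{k-1|k-1}^{0j} := E[x_{k-1} | M_k^j, Z_1^{k-1}] = \sum_{i=1}^{n} \hat{x}_{k-1|k-1}^i\, \mu^{i|j} \tag{5}$$

covariance of the mixed estimate:

$$P_{k-1|k-1}^{0j} := E\{[x_{k-1} - \hat{x}_{k-1|k-1}^{0j}][x_{k-1} - \hat{x}_{k-1|k-1}^{0j}]' | M_k^j, Z_1^{k-1}\} = \sum_{i=1}^{n} \left\{ P_{k-1|k-1}^i + [\hat{x}_{k-1|k-1}^i - \hat{x}_{k-1|k-1}^{0j}][\hat{x}_{k-1|k-1}^i - \hat{x}_{k-1|k-1}^{0j}]' \right\} \mu^{i|j} \tag{6}$$

Prediction and filtering ($\forall j \in \mathcal{M}_n$):

$$\hat{x}_{k|k-1}^j := E[x_k | M_k^j, Z_1^{k-1}] = F_{k-1}^j\, \hat{x}_{k-1|k-1}^{0j} \tag{7}$$

$$P_{k|k-1}^j := E\{[x_k - \hat{x}_{k|k-1}^j][x_k - \hat{x}_{k|k-1}^j]' | M_k^j, Z_1^{k-1}\} = F_{k-1}^j P_{k-1|k-1}^{0j} F_{k-1}^{j\prime} + G_{k-1}^j Q_{k-1}^j G_{k-1}^{j\prime} \tag{8}$$

measurement residual:

$$\nu_k^j := z_k - H_k^j\, \hat{x}_{k|k-1}^j \tag{9}$$

residual covariance:

$$S_k^j := E\{\nu_k^j \nu_k^{j\prime}\} = H_k^j P_{k|k-1}^j H_k^{j\prime} + R_k^j \tag{10}$$

filter gain:

$$W_k^j = P_{k|k-1}^j H_k^{j\prime} (S_k^j)^{-1} \tag{11}$$

filtered state estimate:

$$\hat{x}_{k|k}^j := E\{x_k | M_k^j, Z_1^k\} = \hat{x}_{k|k-1}^j + W_k^j \nu_k^j \tag{12}$$

covariance of the filtered state estimate:

$$P_{k|k}^j := E\{[x_k - \hat{x}_{k|k}^j][x_k - \hat{x}_{k|k}^j]' | M_k^j, Z_1^k\} = P_{k|k-1}^j - W_k^j S_k^j W_k^{j\prime} \tag{13}$$

likelihood function:

$$\Lambda_k^j = \mathcal{N}(\nu_k^j; 0, S_k^j) := |2\pi S_k^j|^{-1/2} \exp\left[-\tfrac{1}{2}\, \nu_k^{j\prime} (S_k^j)^{-1} \nu_k^j\right] \tag{14}$$

mode probability:

$$\mu_k^j := P(M_k^j | Z_1^k) = \frac{\mu_k^{j-} \Lambda_k^j}{\sum_{i=1}^{n} \mu_k^{i-} \Lambda_k^i} \tag{15}$$

Combination:

$$\hat{x}_{k|k} := E\{x_k | Z_1^k\} = \sum_{j=1}^{n} \hat{x}_{k|k}^j\, \mu_k^j \tag{16}$$

$$P_{k|k} := E\{[x_k - \hat{x}_{k|k}][x_k - \hat{x}_{k|k}]' | Z_1^k\} = \sum_{j=1}^{n} \left\{ P_{k|k}^j + [\hat{x}_{k|k}^j - \hat{x}_{k|k}][\hat{x}_{k|k}^j - \hat{x}_{k|k}]' \right\} \mu_k^j \tag{17}$$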
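
The cycle (3)-(17) can be sketched in a few lines of code. The following is a minimal illustration only, assuming a scalar state and scalar measurement so that all matrices reduce to numbers; the function name `imm_cycle` and its argument layout are hypothetical and are not taken from [2] or [10].

```python
import math

def imm_cycle(x_prev, P_prev, mu_prev, p, F, G, Q, H, R, z):
    """One IMM cycle for a scalar state (illustrative sketch).

    x_prev, P_prev, mu_prev: per-model estimates, variances and mode
    probabilities at time k-1. p[i][j] is the transition probability
    from model i to model j. F, G, Q, H, R hold per-model scalars.
    """
    n = len(mu_prev)
    x_new, P_new, lik, mu_pred = [], [], [], []
    for j in range(n):
        # (3) predicted mode probability
        mu_minus = sum(p[i][j] * mu_prev[i] for i in range(n))
        # (4) mixing probabilities
        mix = [p[i][j] * mu_prev[i] / mu_minus for i in range(n)]
        # (5)-(6) mixed estimate and its variance
        x0 = sum(x_prev[i] * mix[i] for i in range(n))
        P0 = sum((P_prev[i] + (x_prev[i] - x0) ** 2) * mix[i] for i in range(n))
        # (7)-(8) prediction
        x_pred = F[j] * x0
        P_pred = F[j] * P0 * F[j] + G[j] * Q[j] * G[j]
        # (9)-(11) residual, residual variance, gain
        nu = z - H[j] * x_pred
        S = H[j] * P_pred * H[j] + R[j]
        W = P_pred * H[j] / S
        # (12)-(13) filtered estimate and variance
        x_new.append(x_pred + W * nu)
        P_new.append(P_pred - W * S * W)
        # (14) Gaussian likelihood of the residual
        lik.append(math.exp(-0.5 * nu * nu / S) / math.sqrt(2.0 * math.pi * S))
        mu_pred.append(mu_minus)
    # (15) updated mode probabilities
    norm = sum(mu_pred[j] * lik[j] for j in range(n))
    mu = [mu_pred[j] * lik[j] / norm for j in range(n)]
    # (16)-(17) combined estimate and variance
    x = sum(x_new[j] * mu[j] for j in range(n))
    P = sum((P_new[j] + (x_new[j] - x) ** 2) * mu[j] for j in range(n))
    return x_new, P_new, mu, x, P
```

Calling `imm_cycle` once per measurement, feeding its first three outputs back in as the next inputs, gives the recursive filter; the combined pair `(x, P)` is an output only and is not fed back.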

3 IMM Fixed-Lag Smoothing Algorithm

For some fixed lag $d$ and all $k > d$, our objective is to find the fixed-lag smoothing state estimate

$$\hat{x}_{k-d|k} := E[x_{k-d} | Z_1^k] \tag{18}$$

and the associated error covariance matrix

$$P_{k-d|k} := E\{[x_{k-d} - \hat{x}_{k-d|k}][x_{k-d} - \hat{x}_{k-d|k}]' | Z_1^k\}. \tag{19}$$

When  d  =  0,  we  have  the  filtered  state  estimate  as  discussed  in  Sec.  2.  We  will  follow  a 
state-augmentation  approach  to  define  a  larger  dynamical  stochastic  hybrid  system  and  then 
apply  the  results  of  Sec.  2  to  this  augmented  system.  State  augmentation  for  derivation  of 
fixed-lag  smoothing  estimators  has  been  used  before  for  “non-switching”  linear  systems  [8, 
Sec.  7.3]  but  not  for  stochastic  hybrid  systems. 


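Augment the state variable $x_k$ to $\bar{x}_k$ as

$$\bar{x}_k := \left[ x_k^{(0)\prime} \;\; x_k^{(1)\prime} \;\; \cdots \;\; x_k^{(d)\prime} \right]' \tag{20}$$

where

$$x_k^{(0)} = x_k, \quad x_k^{(1)} = x_{k-1}, \quad \cdots, \quad x_k^{(d)} = x_{k-d}. \tag{21}$$

Suppose that for the augmented system, we obtain the filtered state estimate

$$\hat{\bar{x}}_{k|k} := E\{\bar{x}_k | Z_1^k\} \tag{22}$$

and the associated covariance matrix

$$\bar{P}_{k|k} := E\{[\bar{x}_k - \hat{\bar{x}}_{k|k}][\bar{x}_k - \hat{\bar{x}}_{k|k}]' | Z_1^k\}. \tag{23}$$

It therefore follows that

$$\hat{x}_{k|k}^{(i)} = \hat{x}_{k-i|k} \tag{24}$$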
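and

$$P_{k|k}^{(i,i)} = P_{k-i|k}, \quad \text{where} \quad \bar{P}_{k|k} = \begin{bmatrix} P_{k|k}^{(0,0)} & \cdots & \cdots & P_{k|k}^{(0,d)} \\ P_{k|k}^{(1,0)} & P_{k|k}^{(1,1)} & \cdots & P_{k|k}^{(1,d)} \\ \vdots & \vdots & \ddots & \vdots \\ P_{k|k}^{(d,0)} & P_{k|k}^{(d,1)} & \cdots & P_{k|k}^{(d,d)} \end{bmatrix}. \tag{25}$$

Note that $\bar{P}_{k|k}$ is symmetric, i.e., $P_{k|k}^{(i,j)} = P_{k|k}^{(j,i)\prime}$.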

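Using (1), (2), (20) and (21), the augmented system can be written as follows:

$$\begin{bmatrix} x_k^{(0)} \\ x_k^{(1)} \\ x_k^{(2)} \\ \vdots \\ x_k^{(d)} \end{bmatrix} = \begin{bmatrix} F_{k-1}^j & 0 & \cdots & 0 & 0 \\ I & 0 & \cdots & 0 & 0 \\ 0 & I & \cdots & 0 & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & I & 0 \end{bmatrix} \begin{bmatrix} x_{k-1}^{(0)} \\ x_{k-1}^{(1)} \\ x_{k-1}^{(2)} \\ \vdots \\ x_{k-1}^{(d)} \end{bmatrix} + \begin{bmatrix} G_{k-1}^j \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix} v_{k-1}^j \tag{26}$$

and

$$z_k = \begin{bmatrix} H_k^j & 0 & \cdots & 0 & 0 \end{bmatrix} \begin{bmatrix} x_k^{(0)} \\ x_k^{(1)} \\ \vdots \\ x_k^{(d)} \end{bmatrix} + w_k^j. \tag{27}$$

The above augmented state and measurement equations may be written more compactly as

$$\bar{x}_k = \bar{F}_{k-1}^j \bar{x}_{k-1} + \bar{G}_{k-1}^j v_{k-1}^j \tag{28}$$

and

$$z_k = \bar{H}_k^j \bar{x}_k + w_k^j \tag{29}$$

where the system matrices $\bar{F}_{k-1}^j$, $\bar{G}_{k-1}^j$ and $\bar{H}_k^j$ are defined in an obvious manner.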
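
To make the block structure of (26)-(27) concrete, the sketch below assembles the augmented matrices from the original ones using plain nested lists. It is only an illustration under the stated layout (current state first, then progressively older states); the helper name `augment` is hypothetical.

```python
def zeros(rows, cols):
    """Return a rows x cols matrix of zeros as nested lists."""
    return [[0.0] * cols for _ in range(rows)]

def augment(F, G, H, d):
    """Build augmented system matrices for lag d, following (26)-(27).

    F is nx x nx, G is nx x nv, H is nz x nx. The augmented state stacks
    x_k, x_{k-1}, ..., x_{k-d}, so it has dimension nx * (d + 1).
    """
    nx, nv, nz = len(F), len(G[0]), len(H)
    N = nx * (d + 1)
    Fa, Ga, Ha = zeros(N, N), zeros(N, nv), zeros(nz, N)
    for r in range(nx):
        for c in range(nx):
            Fa[r][c] = F[r][c]      # top-left block: original dynamics
        for c in range(nv):
            Ga[r][c] = G[r][c]      # process noise enters the current state only
    for r in range(nx, N):
        Fa[r][r - nx] = 1.0         # identity blocks shift older states down
    for r in range(nz):
        for c in range(nx):
            Ha[r][c] = H[r][c]      # only the current state is measured
    return Fa, Ga, Ha
```

Multiplying `Fa` by a stacked vector propagates the first block through `F` and copies each remaining block one slot down, which is exactly the delay-line behaviour that (21) requires.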

We now apply the basic IMM algorithm to the augmented system. Unlike Sec. 2 where $f(x_k | M_k^j, Z_1^k)$ is approximated by the density of a Gaussian random vector, now we approximate $f(\bar{x}_k | M_k^j, Z_1^k) = f(x_k, x_{k-1}, \cdots, x_{k-d} | M_k^j, Z_1^k)$ by the density of a Gaussian random vector. Clearly the latter approximation implies the former whereas the converse is not true in general. The resulting algorithm is as follows:

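Interaction ($\forall j \in \mathcal{M}_n$):

predicted mode probability:

$$\mu_k^{j-} := P\{M_k^j | Z_1^{k-1}\} = \sum_{i=1}^{n} p_{ij}\, \mu_{k-1}^i \tag{30}$$

mixing probability:

$$\mu^{i|j} := P\{M_{k-1}^i | M_k^j, Z_1^{k-1}\} = p_{ij}\, \mu_{k-1}^i / \mu_k^{j-} \tag{31}$$

mixed estimate:

$$\hat{\bar{x}}_{k-1|k-1}^{0j} := E[\bar{x}_{k-1} | M_k^j, Z_1^{k-1}] = \sum_{i=1}^{n} \hat{\bar{x}}_{k-1|k-1}^i\, \mu^{i|j} \tag{32}$$

covariance of the mixed estimate:

$$\bar{P}_{k-1|k-1}^{0j} := E\{[\bar{x}_{k-1} - \hat{\bar{x}}_{k-1|k-1}^{0j}][\bar{x}_{k-1} - \hat{\bar{x}}_{k-1|k-1}^{0j}]' | M_k^j, Z_1^{k-1}\} = \sum_{i=1}^{n} \left\{ \bar{P}_{k-1|k-1}^i + [\hat{\bar{x}}_{k-1|k-1}^i - \hat{\bar{x}}_{k-1|k-1}^{0j}][\hat{\bar{x}}_{k-1|k-1}^i - \hat{\bar{x}}_{k-1|k-1}^{0j}]' \right\} \mu^{i|j} \tag{33}$$

We will only be interested in $P_{k-1|k-1}^{0j(i,i)}$, the diagonal sub-matrices of $\bar{P}_{k-1|k-1}^{0j}$, and $P_{k-1|k-1}^{0j(i,0)}$ for $i = 0, 1, \cdots, d-1$, in order to complete the filtering process in the sequel [8, Sec. 7.3]; see also (37) later in this paper. Thus in (33) we only need to compute $P_{k-1|k-1}^{0j(i,i)}$ and $P_{k-1|k-1}^{0j(i,0)}$ for $i = 0, \cdots, d-1$.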

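Prediction and filtering ($\forall j \in \mathcal{M}_n$):

Using (7), (8) and (26), it follows that

$$\hat{x}_{k|k-1}^{j(0)} := E[x_k | M_k^j, Z_1^{k-1}] = F_{k-1}^j\, \hat{x}_{k-1|k-1}^{0j(0)} \tag{34}$$

$$\hat{x}_{k|k-1}^{j(i)} := E[x_{k-i} | M_k^j, Z_1^{k-1}] = \hat{x}_{k-1|k-1}^{0j(i-1)} \quad \text{for } i = 1, \cdots, d \tag{35}$$

$$P_{k|k-1}^{j(0,0)} = F_{k-1}^j P_{k-1|k-1}^{0j(0,0)} F_{k-1}^{j\prime} + G_{k-1}^j Q_{k-1}^j G_{k-1}^{j\prime} \tag{36}$$

$$P_{k|k-1}^{j(i,i)} = P_{k-1|k-1}^{0j(i-1,i-1)} \quad \text{for } i = 1, \cdots, d \tag{37}$$

and

$$P_{k|k-1}^{j(i,0)} = P_{k-1|k-1}^{0j(i-1,0)} F_{k-1}^{j\prime} \quad \text{for } i = 1, \cdots, d. \tag{38}$$

Using (9), (10) and (27), it follows that

measurement residual:

$$\nu_k^j := z_k - H_k^j\, \hat{x}_{k|k-1}^{j(0)} \tag{39}$$

residual covariance:

$$S_k^j := E\{\nu_k^j \nu_k^{j\prime}\} = H_k^j P_{k|k-1}^{j(0,0)} H_k^{j\prime} + R_k^j \tag{40}$$

filter gain: Using (11) it follows that

$$W_k^{j(i)} = P_{k|k-1}^{j(i,0)} H_k^{j\prime} (S_k^j)^{-1} \quad \text{for } i = 0, \cdots, d. \tag{41}$$

Using (12) and (13) we have

$$\hat{x}_{k|k}^{j(i)} = \hat{x}_{k|k-1}^{j(i)} + W_k^{j(i)} \nu_k^j \quad \text{for } i = 0, 1, \cdots, d \tag{42}$$

$$P_{k|k}^{j(i,i)} = P_{k|k-1}^{j(i,i)} - W_k^{j(i)} S_k^j W_k^{j(i)\prime} \quad \text{for } i = 0, 1, \cdots, d \tag{43}$$

$$P_{k|k}^{j(i,0)} = P_{k|k-1}^{j(i,0)} - W_k^{j(i)} S_k^j W_k^{j(0)\prime} \quad \text{for } i = 1, \cdots, d. \tag{44}$$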
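
likelihood function:

$$\Lambda_k^j = \mathcal{N}(\nu_k^j; 0, S_k^j) = |2\pi S_k^j|^{-1/2} \exp\left[-\tfrac{1}{2}\, \nu_k^{j\prime} (S_k^j)^{-1} \nu_k^j\right] \tag{45}$$

mode probability:

$$\mu_k^j := P(M_k^j | Z_1^k) = \frac{\mu_k^{j-} \Lambda_k^j}{\sum_{i=1}^{n} \mu_k^{i-} \Lambda_k^i} \tag{46}$$

Combination:

$$\hat{\bar{x}}_{k|k} := E[\bar{x}_k | Z_1^k] = \sum_{j=1}^{n} \hat{\bar{x}}_{k|k}^j\, \mu_k^j \tag{47}$$

$$\bar{P}_{k|k} := E\{[\bar{x}_k - \hat{\bar{x}}_{k|k}][\bar{x}_k - \hat{\bar{x}}_{k|k}]' | Z_1^k\} = \sum_{j=1}^{n} \left\{ \bar{P}_{k|k}^j + [\hat{\bar{x}}_{k|k}^j - \hat{\bar{x}}_{k|k}][\hat{\bar{x}}_{k|k}^j - \hat{\bar{x}}_{k|k}]' \right\} \mu_k^j \tag{48}$$

In (48), as before, we need not compute all the elements as we are only interested in $P_{k|k}^{(i,i)}$ and $P_{k|k}^{(i,0)}$ for $i = 0, \cdots, d$.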

Finally we obtain the smoothed state estimates (in addition to the current state estimate)

$$\hat{x}_{k-i|k} = \hat{x}_{k|k}^{(i)} \tag{49}$$

and the associated error covariance matrix

$$P_{k-i|k} = P_{k|k}^{(i,i)} \tag{50}$$

for $i = 0, \cdots, d$.

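Initialization for the Augmented System: In order to let the augmented system have the same dynamics as the original system, we set

$$\hat{\bar{x}}_{0|0} = [\hat{x}_0' \;\; 0 \;\; \cdots \;\; 0]' \tag{51}$$

which implies that

$$\hat{x}_{0|0}^{(0)} = \hat{x}_0 \quad \text{and} \quad \hat{x}_{0|0}^{(i)} = 0 \text{ for } i \neq 0 \tag{52}$$

and

$$P_{0|0}^{(0,0)} = P_0 \quad \text{and} \quad P_{0|0}^{(k,l)} = 0 \text{ for } (k, l) \neq (0, 0). \tag{53}$$

Recall that for the original system we have $\hat{x}_{0|0} = \hat{x}_0$ and $P_{0|0} = P_0$.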
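
As a rough illustration of how the sub-block recursions (34)-(44) and the initialization (51)-(53) fit together, the sketch below treats the special case of a single model ($n = 1$) with a scalar state, so that the interaction and combination steps are trivial. This is a simplification for exposition, not the full multiple-model algorithm; the names `fixed_lag_init` and `fixed_lag_step` are hypothetical.

```python
def fixed_lag_init(x0, P0, d):
    """Initial augmented estimate and the needed variance blocks, per (51)-(53)."""
    xs = [x0] + [0.0] * d      # xs[i] estimates x_{k-i}
    Pd = [P0] + [0.0] * d      # Pd[i] is the diagonal block P^{(i,i)}
    Pc = [P0] + [0.0] * d      # Pc[i] is the first-column block P^{(i,0)}
    return xs, Pd, Pc

def fixed_lag_step(xs, Pd, Pc, F, G, Q, H, R, z):
    """One measurement update of a scalar, single-model fixed-lag smoother."""
    d = len(xs) - 1
    # (34)-(35): predict the current state, shift the delayed estimates
    xp = [F * xs[0]] + xs[:d]
    # (36)-(37): diagonal blocks
    Pdp = [F * Pd[0] * F + G * Q * G] + Pd[:d]
    # (38): first-column blocks
    Pcp = [Pdp[0]] + [Pc[i - 1] * F for i in range(1, d + 1)]
    # (39)-(41): residual, residual variance, gains for every lag
    nu = z - H * xp[0]
    S = H * Pdp[0] * H + R
    W = [Pcp[i] * H / S for i in range(d + 1)]
    # (42)-(44): updated estimates and variance blocks
    xn = [xp[i] + W[i] * nu for i in range(d + 1)]
    Pdn = [Pdp[i] - W[i] * S * W[i] for i in range(d + 1)]
    Pcn = [Pcp[i] - W[i] * S * W[0] for i in range(d + 1)]
    return xn, Pdn, Pcn
```

After more than $d$ updates, `xn[i]` and `Pdn[i]` play the roles of $\hat{x}_{k-i|k}$ and $P_{k-i|k}$ in (49)-(50). For a random-walk model one would expect `Pdn[i]` to decrease with `i`, reflecting the benefit of the extra measurements.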


4  Approximation  of  Mode  Probabilities 


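In Sec. 3, we obtained only the mode probability at $t_k$ in (46). In keeping with fixed-lag smoothing, we would also like to obtain the mode probabilities $P(M_{k-i}^j | Z_1^k)$ for $i = 1, \cdots, d$. Following some of the approximations made in [7], we make an approximation by replacing $Z_1^k$ with $\{\hat{x}_{k-i|k}, Z_1^{k-i}\}$, i.e.,

$$P(M_{k-i}^j | Z_1^k) \approx P(M_{k-i}^j | \hat{x}_{k-i|k}, Z_1^{k-i}) = \frac{1}{c}\, f(\hat{x}_{k-i|k} | M_{k-i}^j, Z_1^{k-i})\, P(M_{k-i}^j | Z_1^{k-i}) \tag{55}$$

where $c$ in (55) is a normalization constant given by

$$c = \sum_{j=1}^{n} f(\hat{x}_{k-i|k} | M_{k-i}^j, Z_1^{k-i})\, P(M_{k-i}^j | Z_1^{k-i}). \tag{56}$$

In a manner similar to that in Eqns. (75) and (84) of [7], we have replaced the measurements $Z_{k-i+1}^k$ in (55) with the smoothed state estimate $\hat{x}_{k-i|k}$. We also make the approximations

$$f(\hat{x}_{k-i|k} | M_{k-i}^j, Z_1^{k-i}) \approx f(x_{k-i} | M_{k-i}^j, Z_1^{k-i})\big|_{x_{k-i} = \hat{x}_{k-i|k}} \approx \mathcal{N}(\hat{x}_{k-i|k}; \hat{x}_{k-i|k-i}^j, P_{k-i|k-i}^j) \tag{57}$$

where

$$\mathcal{N}(x; y, P) := |2\pi P|^{-1/2} \exp\left[-\tfrac{1}{2} (x - y)' P^{-1} (x - y)\right]. \tag{58}$$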
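Therefore, (55) can be rewritten as

$$P(M_{k-i}^j | Z_1^k) \approx \frac{1}{c'}\, \mathcal{N}(\hat{x}_{k-i|k}; \hat{x}_{k-i|k-i}^j, P_{k-i|k-i}^j)\, P(M_{k-i}^j | Z_1^{k-i}) \tag{59}$$

where the normalization constant $c'$ is given by

$$c' = \sum_{j=1}^{n} \mathcal{N}(\hat{x}_{k-i|k}; \hat{x}_{k-i|k-i}^j, P_{k-i|k-i}^j)\, P(M_{k-i}^j | Z_1^{k-i}).$$

We note that in (59), $P(M_{k-i}^j | Z_1^{k-i})$ is the “old” mode probability based on filtered state estimates when measurements only up to $t_{k-i}$ are available. It is the likelihood function $\mathcal{N}(\hat{x}_{k-i|k}; \hat{x}_{k-i|k-i}^j, P_{k-i|k-i}^j)$ that utilizes the new information contained in measurements after time $t_{k-i}$.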


It should be noted that (58) cannot be applied if $|P| = 0$. As will be seen in Sec. 6 in a target tracking context, for constant velocity models with acceleration as a system state, such a situation can arise. Our solution (discussed in more detail in Sec. 6) is to use a state of reduced order such that the corresponding covariance matrix is of full rank.
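
The reweighting in (59) can be illustrated with a short sketch, assuming for simplicity a scalar state so that the Gaussian in (58) is one-dimensional. The function names `gauss` and `smoothed_mode_probs` are hypothetical, and the inputs are assumed to have been stored from the filtering pass at time $t_{k-i}$.

```python
import math

def gauss(x, y, P):
    """Scalar version of N(x; y, P) in (58); requires P > 0."""
    return math.exp(-0.5 * (x - y) ** 2 / P) / math.sqrt(2.0 * math.pi * P)

def smoothed_mode_probs(x_smooth, x_filt, P_filt, mu_old):
    """Approximate P(M_{k-i}^j | Z_1^k) as in (59).

    x_smooth: smoothed estimate of x_{k-i} given data up to time k.
    x_filt[j], P_filt[j]: model-j filtered estimate and variance at k-i.
    mu_old[j]: filtered ("old") mode probability at k-i.
    """
    weights = [gauss(x_smooth, x_filt[j], P_filt[j]) * mu_old[j]
               for j in range(len(mu_old))]
    c = sum(weights)  # normalization constant c'
    return [w / c for w in weights]
```

For example, with equal old probabilities, the model whose filtered estimate lies closer to the smoothed estimate receives the larger smoothed probability. The guard against a zero variance discussed above is left to the caller.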


5  Analysis  of  Computational  Load 

Here we carry out a “crude” comparison of the IMM smoothing algorithm of Secs. 3 and 4 with the original non-augmented system IMM filtering algorithm (see Sec. 2) regarding their relative computational requirements.

During interaction, comparing (3) and (4) with (30) and (31), respectively, it is seen that the computational loads are identical. Comparing (5) and (6) with (32) and (33), respectively, it is seen that the latter needs more multiplications; the computational load of (32) is $d$ times that of (5) and the computational load of (33) is originally $d^2$ times that of (6), but since we only need the diagonal sub-matrices and the sub-matrices in the first column of $\bar{P}_{k-1|k-1}^{0j}$, it turns out that the computational load of (33) is reduced to about $2d$ times that of (6).

During prediction and filtering, comparing (7) and (8) with (34)-(38), we first note that
(35)  and  (37)  do  not  need  any  additional  computations  at  all.  Furthermore,  (34)  and  (36) 
have  the  same  computational  load  as  that  for  (7)  and  (8).  The  only  computational  increase 
here  for  the  smoothing  algorithm  is  in  (38).  There  is  no  relative  computational  load  increase 
in  computing  measurement  residual  (see  (9)  and  (39)),  residual  covariance  (see  (10)  and 
(40)),  likelihood  function  (see  (14)  and  (45))  and  mode  probability  (see  (15)  and  (46)). 
There  is  indeed  some  computational  load  increase  when  computing  the  filter  gain  (see  (11) 
and  (41)),  but  this  only  involves  matrix  multiplication  (and  not  other  complex  operations 
such  as  computing  inverse  of  a  matrix).  The  same  is  true  for  state  estimate  and  its  error 
covariance  matrix  for  each  mode  (see  (12)-(13)  and  (42)-(44)). 

During  combination,  the  relative  computational  load  increases  are  similar  to  that  for  (32) 
and  (33),  i.e.,  of  the  order  of  d  times  that  of  (17). 

The mode probability calculations in (59) are of the same order as that for (46) and (15).

Overall  we  run  n  filters /smoothers  in  parallel  whereas  [7]  runs  smoothers  in  parallel 
and  [6]  runs  n  smoothers  in  parallel.  Unlike  [6]  and  [7],  we  do  not  need  a  backward-time 
model  (and  its  “initial”  conditions  at  final  time).  More  significantly,  we  run  the  smoother 
(beyond  the  filtering  part)  only  for  d  samples  whereas  [6]  and  [7]  run  it  over  the  entire 
measurement  record. 

6 Simulation Example


A  target  tracking  example  is  provided  to  compare  the  performance  of  the  IMM  fixed-lag 
smoothing  algorithm  and  the  (forward-time)  IMM  filtering  algorithm.  This  example  has 
been  taken  from  [7].  The  following  scenario  is  considered.  A  target  is  moving  in  a  two- 
dimensional  plane  with  a  constant  speed  and  performing  two  constant-speed  Zg  maneuvers. 
The  first  maneuver  occurs  from  10  to  22  s,  and  the  second  one  from  26  to  38  s.  The  true 
position,  velocity  and  acceleration  of  the  target  are  shown  in  Fig.  1.  Position  measurements 
(range  and  bearing)  of  the  target  are  sampled  with  period  T  =  1  s.  The  measurements 
contain  zero-mean  Gaussian  errors  with  standard  deviations  of  15  m  in  range  and  0.002  rad 
in  bearing. 


The state of the target is defined as

$$x = [\xi \;\; \dot{\xi} \;\; \ddot{\xi} \;\; \eta \;\; \dot{\eta} \;\; \ddot{\eta}]' \tag{60}$$

with $\xi$ and $\eta$ denoting the orthogonal (Cartesian) coordinates of the horizontal plane and $\dot{\xi} := \frac{d\xi}{dt}$. The discrete-time multiple model set consists of two models ($n = 2$ in (1)-(2)) as:

1) The Constant Velocity (CV) model ($j = 1$ in (1)-(2)) with a piecewise-constant acceleration process noise and the noise covariance $Q_{CV} = 0.25\, I_2$ m$^2$/s$^4$ where $I_2$ is the $2 \times 2$ identity matrix. The corresponding system matrices are [3]:


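$$F_{k-1}^1 = F_{CV} = \begin{bmatrix} 1 & T & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & T & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}, \quad G_{k-1}^1 = G_{CV} = \begin{bmatrix} \frac{1}{2}T^2 & 0 \\ T & 0 \\ 0 & 0 \\ 0 & \frac{1}{2}T^2 \\ 0 & T \\ 0 & 0 \end{bmatrix} \tag{61}$$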
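
2) The Constant Acceleration (CA) model with a piecewise-constant jerk process noise and the noise covariance $Q_{CA} = 9\, I_2$ m$^2$/s$^6$. The corresponding system matrices are:

$$F_{k-1}^2 = F_{CA} = \begin{bmatrix} 1 & T & \frac{1}{2}T^2 & 0 & 0 & 0 \\ 0 & 1 & T & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & T & \frac{1}{2}T^2 \\ 0 & 0 & 0 & 0 & 1 & T \\ 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix}, \quad G_{k-1}^2 = G_{CA} = \begin{bmatrix} \frac{1}{6}T^3 & 0 \\ \frac{1}{2}T^2 & 0 \\ T & 0 \\ 0 & \frac{1}{6}T^3 \\ 0 & \frac{1}{2}T^2 \\ 0 & T \end{bmatrix} \tag{62}$$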
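
The two kinematic models can be written out directly in code, which also makes the rank deficiency of the CV model easy to see. The sketch below assumes the six-element state ordering of (60), with the acceleration entries zeroed in the CV model; the function names are hypothetical.

```python
def cv_matrices(T):
    """System matrices of the CV model (j = 1) for sampling period T."""
    F = [[1.0, T,   0.0, 0.0, 0.0, 0.0],
         [0.0, 1.0, 0.0, 0.0, 0.0, 0.0],
         [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],   # acceleration forced to zero
         [0.0, 0.0, 0.0, 1.0, T,   0.0],
         [0.0, 0.0, 0.0, 0.0, 1.0, 0.0],
         [0.0, 0.0, 0.0, 0.0, 0.0, 0.0]]   # acceleration forced to zero
    G = [[T * T / 2, 0.0], [T, 0.0], [0.0, 0.0],
         [0.0, T * T / 2], [0.0, T], [0.0, 0.0]]
    return F, G

def ca_matrices(T):
    """System matrices of the CA model (j = 2) for sampling period T."""
    F = [[1.0, T,   T * T / 2, 0.0, 0.0, 0.0],
         [0.0, 1.0, T,         0.0, 0.0, 0.0],
         [0.0, 0.0, 1.0,       0.0, 0.0, 0.0],
         [0.0, 0.0, 0.0, 1.0, T,   T * T / 2],
         [0.0, 0.0, 0.0, 0.0, 1.0, T],
         [0.0, 0.0, 0.0, 0.0, 0.0, 1.0]]
    G = [[T ** 3 / 6, 0.0], [T * T / 2, 0.0], [T, 0.0],
         [0.0, T ** 3 / 6], [0.0, T * T / 2], [0.0, T]]
    return F, G

def matvec(A, x):
    """Multiply a nested-list matrix by a vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]
```

Because the third and sixth rows of both CV matrices are zero, the CV model assigns zero variance to the acceleration components, which is the situation in which (58) cannot be evaluated on the full state.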


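The initial model probabilities are $\mu_0^1 = 0.9$ and $\mu_0^2 = 0.1$ (as in [7]). The model-switching probability matrix is given by

$$\begin{bmatrix} p_{11} & p_{12} \\ p_{21} & p_{22} \end{bmatrix} = \begin{bmatrix} 0.95 & 0.05 \\ 0.10 & 0.90 \end{bmatrix}. \tag{63}$$
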

The  initial  estimates  of  the  velocity  and  acceleration  are  arbitrarily  set  to  zero,  with  variances 
of  10®m^/s^  and  10®m^/s'*,  respectively,  as  in  [7]. 

Let $x_{(i)k}$ denote the $i$-th element of vector $x_k$ which is a 6-vector (cf. (60)). The measurements are given by

$$z_k = \begin{bmatrix} \sqrt{x_{(1)k}^2 + x_{(4)k}^2} \\ \arctan\left(x_{(4)k} / x_{(1)k}\right) \end{bmatrix} + w_k \tag{64}$$

where the measurement equation is the same for the two models in the model set. The covariance matrix of the 2-vector $w_k$ is given by (as in [7])

$$R_k^1 = R_k^2 = \begin{bmatrix} (15 \text{ m})^2 & 0 \\ 0 & (0.002 \text{ rad})^2 \end{bmatrix}. \tag{65}$$

A first-order Taylor series expansion around $\hat{x}_{k|k-1}^j$ was used to linearize (64), i.e. a first-order extended Kalman filter (EKF) was used to apply the various estimation algorithms.
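
The linearization mentioned above amounts to evaluating the Jacobian of the range and bearing measurement function at the predicted state. The following sketch shows one way to write it, assuming the state ordering of (60); the names `h` and `h_jacobian` are hypothetical, and `math.atan2` is used in place of the plain arctangent of (64) as an assumption, so that the bearing is well defined in all four quadrants.

```python
import math

def h(x):
    """Noise-free range and bearing for a 6-element state x, cf. (64)."""
    xi, eta = x[0], x[3]
    return [math.hypot(xi, eta), math.atan2(eta, xi)]

def h_jacobian(x):
    """2 x 6 Jacobian of h evaluated at x (first-order EKF linearization)."""
    xi, eta = x[0], x[3]
    r2 = xi * xi + eta * eta
    r = math.sqrt(r2)
    return [[xi / r,    0.0, 0.0, eta / r,  0.0, 0.0],
            [-eta / r2, 0.0, 0.0, xi / r2,  0.0, 0.0]]
```

In an EKF the matrix returned by `h_jacobian` at the predicted state would take the place of $H_k^j$ in the residual covariance and gain computations, while `h` itself is used to form the measurement residual.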

The fixed-lag IMM smoothing algorithm was implemented using a first-order EKF over 100 Monte Carlo runs. Note that due to the fact that all the elements in the 3rd and 6th rows of $G_{CV}$ are zero, the determinant of $P_{k-i|k-i}^1$ is zero where model $j = 1$ is the CV model. Therefore in this example we cannot apply (57) directly. Instead we use a state of reduced order with $x_r := [\xi \;\; \dot{\xi} \;\; \eta \;\; \dot{\eta}]'$ so that we can evaluate (57). The order of the corresponding covariance matrix $P_{k-i|k-i}^j$ ($j = 1, 2$) is reduced to match the state accordingly. Note we need to use $x_r$ for both the CV model and the CA model, in order to calculate the likelihood function defined in (57). This ‘adaptation’ is reasonable because we should weight the probabilities of all models based on the same set of states.

Fig.  2  displays  the  average  root-mean-square  errors  (RMSE)  in  position,  velocity  and 
acceleration,  and  the  average  CV  model  probabilities.  The  legend  used  in  Fig.  2  is  self- 
explanatory:  0  stands  for  the  case  of  (forward-time)  IMM  filtering  (no  smoothing,  or  fixed- 
lag  d  =  0)  and  1,  2  and  3  stand  for  the  case  of  the  proposed  fixed-lag  IMM  smoothing 
algorithm  with  fixed-lags  d  =1,  2  and  3,  respectively.  The  thick  solid  line  in  Fig.  2(d)  stands 
for  the  normalized  magnitude  of  the  acceleration.  It  can  be  seen  from  Fig.  2(a)-(c)  that 
the  various  RMSE’s  using  the  proposed  smoothing  algorithm  decrease  with  increasing  lag 
d.  Comparing  Fig.  2  with  Fig.  4  in  [7],  it  is  seen  that  the  performance  of  our  algorithm  for 
d  =  3  almost  approaches  that  of  fixed-interval  smoothing  algorithm  presented  in  [7].  When 
d  =  1,  the  most  significant  reduction  in  RMSE  occurs  where  the  peak  RMSE  in  position  is 
reduced  from  145  m  (no  smoothing)  to  65  m  while  the  peak  RMSE  in  velocity  is  reduced 
from  102  m/s  (no  smoothing)  to  78  m/s.  Besides,  significant  reductions  in  RMSE  are  also 
achieved  over  the  entire  tracking  interval. 

Fig. 2(d) displays the average CV model probability for $d = 0, \cdots, 3$, as well as the normalized magnitude of the true acceleration. Clearly maneuvers are detected by the proposed smoothing algorithm ($d \geq 1$) more quickly compared to the forward-time IMM filtering algorithm. Except for a short period following model switching, the probability of one of the models in the model set is always quite close to one and it reflects the true motion status of the target, whereas the mode probability obtained via the forward-time IMM filtering algorithm is sometimes somewhat uncertain even after the transient stage.

Overall  it  is  seen  that  even  a  small  lag  can  lead  to  a  much  better  state  estimation 
performance. 

7  Conclusions 

We  investigated  a  suboptimal  approach  to  the  fixed-lag  smoothing  problem  for  Markovian 
switching  systems.  A  fixed-lag  smoothing  algorithm  was  developed  based  on  the  concept  of 
interacting  multiple  models  [2].  The  filtering  and  smoothing  for  the  original  system  were 
integrated  by  introducing  a  state-augmented  system  whose  current  state  vector  consists  of 
the  current  and  delayed  states  (down  to  a  fixed-lag  d)  of  the  original  system.  The  fixed-lag 
mode  probabilities  given  measurements  up  to  the  current  time  were  approximated  using  a 
simple  but  effective  method. 

Simulation  results  for  estimating  the  trajectory  of  a  maneuvering  target  were  presented 
to  compare  the  performances  of  the  proposed  smoothing  algorithm  and  the  forward-time 
IMM  filtering  algorithm  using  an  example  from  [7].  The  performance  of  the  fixed-lag  IMM 
smoother  was  significantly  better  than  that  of  the  IMM  filter.  The  performance  of  the 
proposed  fixed-lag  IMM  smoothing  algorithm  quickly  approaches  that  achieved  by  the  fixed- 
interval  smoothing  algorithm  of  [7]  with  increasing  lag;  recall  that  we  have  used  the  example 
of  [7]  in  Sec.  6.  Compared  to  fixed-interval  smoothers,  the  fixed-lag  smoothers  can  be 
implemented  in  real-time  with  a  small  delay. 

Overall  we  run  n  filters /smoothers  in  parallel  where  n  is  the  number  of  models  in  the 
model  set.  The  total  computational  load  is  roughly  {d  +  1)  times  that  required  by  the 
forward-time  IMM  filtering  for  the  original  system.  Given  measurements  up  to  time  tk,  in 
addition  to  the  smoothed  state  estimate  at  time  tk-d  we  also  obtain  the  smoothed  state 
estimates  from  tk-d+i  through  tk-i  and  the  current  state  estimate  at  tk  without  any  extra 
effort.

Fig. 2. Comparison of the filter and smoothers for various fixed-lags d = 0, 1, 2, 3: (a) RMSE in position, (b) RMSE in velocity, (c) RMSE in acceleration, (d) CV model probability. Solid: lag d=0; dashed: lag d=1; dot-dashed: lag d=2; dotted: lag d=3.

Abstract

The problem of blind equalization of SIMO (single-input multiple-output) communications channels is considered using only the second-order statistics of the data. Such models arise when the data from a single receiver is fractionally sampled (assuming that there is excess bandwidth), or when an antenna array is used with or without fractional sampling. We focus on direct design of finite-length MMSE (minimum mean-square error) blind equalizers. Unlike the past work on this problem, we allow infinite impulse response (IIR) channels. Our approaches also work when the subchannel transfer functions have common zeros so long as the common zeros are minimum-phase zeros. Illustrative simulation examples are provided.


This work was supported by the National Science Foundation under Grant MIP-9312559 and by the Office of Naval Research under Grant N00014-97-1-0822.


1  Introduction 


Consider a discrete-time SIMO (single-input multiple-output) system with $N$ outputs and one input. The $i$-th component of the output at time $k$ is given by

$$y_i(k) = \mathcal{F}_i(z)\, w(k) + n_i(k), \quad i = 1, 2, \cdots, N, \tag{1-1}$$

$$\Rightarrow \quad y(k) = \mathcal{F}(z)\, w(k) + n(k) = s(k) + n(k), \tag{1-2}$$

where $y(k) = [y_1(k) \;\; y_2(k) \;\; \cdots \;\; y_N(k)]^T$, similarly for $s(k)$ and $n(k)$, and $z$ is the $\mathcal{Z}$-transform variable as well as the backward-shift operator (i.e., $z^{-1} w(k) = w(k-1)$, etc.). The sequence $w(k)$ is the (single) input at sampling time $k$, $y_i(k)$ is the $i$-th noisy output, $s_i(k)$ is the $i$-th noise-free output, $n_i(k)$ is the additive measurement noise, and

$$\mathcal{F}_i(z) := \sum_{l=0}^{\infty} f_i(l)\, z^{-l} \tag{1-3}$$

is the scalar transfer function with $w(k)$ as the input and $y_i(k)$ as the output; it represents the $i$-th subchannel. We allow all of the above variables to be complex-valued. The overall transfer function is denoted by the $N \times 1$ $\mathcal{F}(z)$ with its $i$-th element as $\mathcal{F}_i(z)$. We have

$$\mathcal{F}(z) = [\mathcal{F}_1(z) \;\; \mathcal{F}_2(z) \;\; \cdots \;\; \mathcal{F}_N(z)]^T. \tag{1-4}$$
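
As a small illustration of the SIMO model, the sketch below computes the noise-free outputs $s_i(k)$ for subchannels whose impulse responses have been truncated to a finite number of taps. The truncation and the function name `simo_output` are assumptions made for the example; the paper itself allows infinite impulse responses.

```python
def simo_output(f, w):
    """Noise-free SIMO outputs for finite-length subchannel responses.

    f[i][l] is the l-th tap of subchannel i and w is the input sequence.
    Returns a list s where s[k][i] is the i-th output at time k, assuming
    the input is zero before time 0.
    """
    outputs = []
    for k in range(len(w)):
        row = []
        for taps in f:
            # convolution sum: s_i(k) = sum_l f_i(l) w(k - l)
            row.append(sum(taps[l] * w[k - l]
                           for l in range(len(taps)) if k - l >= 0))
        outputs.append(row)
    return outputs
```

Stacking the entries of each `s[k]` gives the vector $s(k)$ in (1-2); adding a noise vector to it yields the observed $y(k)$.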

Such models arise in several useful baseband-equivalent digital communications and other applications. A case of some interest is that of fractionally-spaced samples of a single baseband received signal leading to a SIMO model [1], [4], [8]. Alternatively, a similar model can be derived when we have a single signal impinging upon an antenna array with $N$ elements [5]. A similar model arises if we have an antenna array coupled with fractional sampling at each array element [5].

In these applications one of the objectives is to recover the inputs $w(k)$ given the noisy measurements but not given the knowledge of the system transfer function. Recently there has been much interest in solving this problem using only (or at least, to the maximum extent possible) the second-order statistics (SOS) of the data (see [1], [3]-[5], [8]-[14] and references therein). The solution is closely tied to existence of an FIR (finite impulse response) inverse to the system transfer function [1], [3]-[5], [8]-[14]. An overwhelming number of papers (see [4], [5], [9]-[12] and references therein) have concentrated on a two-step procedure: first estimate the channel impulse response (IR) and then design an equalizer using the estimated channel. A fundamental restriction in these works is that the channel is FIR with no common zeros among the various subchannels. A few (see [1] and [13], e.g.) have proposed direct design of the equalizer bypassing channel estimation. Still they assume FIR channels with no common zeros.

In this paper we allow IIR (infinite impulse response) channels (which are finitely parametrized).
We will also allow common zeros so long as they are minimum-phase (i.e., they lie inside the unit
circle). Finally, in the presence of nonminimum-phase common zeros, our proposed approach
equalizes the spectrally-equivalent minimum-phase counterpart of the channel; it does not “fall apart”
unlike quite a few existing approaches. We should note that our proposed approach is inspired by
[1]. Unlike [1] our approach applies to antenna arrays since we do not require that $f_1(0) \neq 0$ but
$f_i(0) = 0$ for $i = 2, 3, \ldots, N$, as is required by [1]. This requirement of [1] is not restrictive for
single-receiver causal systems with fractional sampling as one can always achieve this by shifting,
i.e. “re-grouping” of fractional samples per symbol. It does demand symbol synchronization so
that fractional samples belonging to a given symbol are known thereby allowing for shifting or
re-grouping to achieve the aforementioned requirement. In this paper we don’t require such a
synchronization; only the baud rate ought to be known.

Note that the prediction error methods of [8], [9] and [14] apply to the problem under consideration
with some straightforward extensions/modifications (as we discuss in Sec. 3.3.3). Interestingly,
[8],  [9]  and  [14]  derive  their  results  under  the  assumption  of  FIR  channels  with  no  common  zeros. 

Three approaches are proposed in this paper for designing a blind MMSE (minimum mean-square
error) linear equalizer of a specified length and delay. The approaches do not require the
knowledge  of  the  underlying  system  model  orders  or  IR  length.  Algorithms  I  and  II  are  inspired  by 
[1]  whereas  Algorithm  III  is  a  straightforward  extension  of  [9]  and  [14].  Algorithm  II  also  exploits 
some  results  from  [9]  and  [14]  (see  Remark  4  in  Sec.  3.1). 

The  paper  is  organized  as  follows.  Precise  model  assumptions  and  some  background  results  used 
later  in  the  paper  are  stated  and  developed  in  Sec.  2.  IIR  channels  with  no  common  subchannel 
zeros  are  considered  in  Sec.  3  where  the  three  proposed  algorithms  of  this  paper  are  developed. 
Under the assumptions of Sec. 3, finite length inverses and zero-forcing equalizers exist. In Sec. 4
we  allow  common  subchannel  zeros.  Here  ideally  we  need  infinite  length  inverses  and  zero-forcing 
equalizers.  Two  computer  simulation  examples  involving  a  4-QAM  signal  are  presented  in  Sec.  5 
to  illustrate  and  compare  the  performances  of  the  proposed  approaches. 


2  Model  Assumptions  and  Preliminaries 

In  this  section  we  consider  precise  model  assumptions  and  some  background  results  used  later  in 
the  paper.  The  material  in  Sec.  2.1  is  useful  in  developing  Algorithm  I  (see  Sec.  3)  whereas  the 
material  in  Sec.  2.2  is  useful  in  developing  Algorithm  III.  Algorithm  II  exploits  both  Secs.  2.1  and 
2.2.  Lemma  2  is  needed  to  estimate  the  noise  variance.  Lemma  1  is  a  straightforward  extension  of 
the  results  of  [9]  and  [14]. 

2.1  FIR  Inverses 

Let $\mathcal{F}(z) = A^{-1}(z)B(z)$ where $A(z) = 1 + \sum_{i=1}^{n_a} a_i z^{-i}$ is $1 \times 1$ and $B(z) = \sum_{i=0}^{n_b} B_i z^{-i}$ is $N \times 1$.

Assume the following:

(H1) $N > 1$.

(H2) Rank$\{B(z)\} = 1$ $\forall z$ including $z = \infty$ but excluding $z = 0$, i.e., $B(z)$ is irreducible [7,
Sec. 6.3].

(H3) $A(z) \neq 0$ for $|z| \geq 1$.

Assumption (H2) is equivalent to stating that the various subchannels $\mathcal{F}_i(z)$ have no common zeros.
It has been shown in [6] (using some results from [2]) that under (H1)-(H3) there exists a finite
degree left-inverse (not necessarily unique) of $\mathcal{F}(z)$:

$$\mathcal{G}(z)\mathcal{F}(z) = 1 \tag{2-1}$$

where $\mathcal{G}(z)$ is $1 \times N$ given by

$$\mathcal{G}(z) = \sum_{l=0}^{L_e} G_l z^{-l} \quad \text{for any } L_e \geq n_a + n_b - 1. \tag{2-2}$$
Remark 1: The left-inverse $\mathcal{G}(z)$ of $\mathcal{F}(z)$ consists of two parts: $\mathcal{G}(z) = \mathcal{G}_B(z)A(z)$ where
$\mathcal{G}_B(z)B(z) = 1$ so that $\mathcal{G}(z)\mathcal{F}(z) = \mathcal{G}_B(z)A(z)A^{-1}(z)B(z) = \mathcal{G}_B(z)B(z) = 1$. Finite length left-inverses
of FIR SIMO channels have been the subject of intense research activity [4]-[6],[8]-[13]. Left-inverses
to MIMO IIR/FIR channels have been considered in [6]. It appears that the results of [6]
pertaining to MIMO models are the sharpest to date. Finally, it is important to stress that [4],
[5] and [8]-[13] do not allow IIR channels, or subchannels having common zeros, in their problem
formulation unlike this contribution.
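The existence of a finite left-inverse can be checked numerically for a small FIR case. The following is a minimal sketch, assuming a hypothetical two-subchannel FIR channel (the coefficients in `F`, the helper name `sylvester` and the choice $L_e = 1$ are illustrative and not taken from the paper). It builds the finite block-Toeplitz matrix that the channel coefficients form and solves for the equalizer taps with a pseudoinverse.

```python
import numpy as np

def sylvester(F, Le):
    """Finite block-Toeplitz matrix of channel coefficients for an FIR channel.

    F has shape (nb + 1, N): row m holds the N x 1 coefficient F_m.
    Block row l carries F_m in column l + m.
    """
    nb1, N = F.shape
    S = np.zeros((N * (Le + 1), Le + nb1), dtype=complex)
    for l in range(Le + 1):
        for m in range(nb1):
            S[l * N:(l + 1) * N, l + m] = F[m]
    return S

# Hypothetical channel: subchannel zeros at -0.4 and 0.6, so none are shared.
F = np.array([[1.0, 0.5],
              [0.4, -0.3]], dtype=complex)
Le = 1

S = sylvester(F, Le)
e1 = np.zeros(S.shape[1])
e1[0] = 1.0

# Minimum-norm row vector [G_0 ... G_Le] with [G_0 ... G_Le] S = [1 0 ... 0].
G = e1 @ np.linalg.pinv(S)
G_blocks = G.reshape(Le + 1, -1)   # row l is the 1 x N tap G_l

combined = G @ S                   # impulse response of G(z) F(z)
print(np.round(combined, 12))
```

If the subchannels shared a zero, `S` would lose column rank and `combined` would no longer be a unit impulse.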

2.2  Linear  Innovations  Representations 
Assume  further  the  following: 

(H4) $\{w(k)\}$ is zero-mean, white. Take $E\{|w(k)|^2\} = 1$ by absorbing any non-identity correlation
of $w(k)$ into $\mathcal{F}(z)$.

Lemma 1. Under (H1)-(H4), $\{s(k)\}$ may be represented as

$$s(k) = -\sum_{i=1}^{M} D_i s(k-i) + I_s(k) \tag{2-3}$$

where $M = n_a + n_b - 1$, $D_i$'s are some $N \times N$ matrices such that $\det(\mathcal{D}(z)) \neq 0$ for $|z| > 1$,
$\mathcal{D}(z) = I + \sum_{i=1}^{M} D_i z^{-i}$ and $\{I_s(k)\}$ is a zero-mean white $N \times 1$ random sequence (linear innovations
for $\{s(k)\}$) with

$$E\{I_s(k)I_s^H(k)\} = F_0F_0^H \quad \text{and} \quad \|F_0\|^{-2}F_0^H I_s(k) = w(k). \quad \bullet \tag{2-4}$$

Proof: Consider the process

$$s'(k) := A(z)s(k) = B(z)w(k). \tag{2-5}$$

By [9] and [14], under (H1), (H2) and (H4), we have

$$s'(k) = -\sum_{i=1}^{n_b-1} D'_i s'(k-i) + I'_s(k) \tag{2-6}$$

where $D'_i$'s are some $N \times N$ matrices such that $\det(\mathcal{D}'(z)) \neq 0$ for $|z| > 1$, $\mathcal{D}'(z) = I + \sum_{i=1}^{n_b-1} D'_i z^{-i}$
and $\{I'_s(k)\}$ is a zero-mean white $N \times 1$ random sequence (linear innovations for $\{s'(k)\}$) with

$$E\{I'_s(k)I'^H_s(k)\} = B_0B_0^H = F_0F_0^H \quad \text{and} \quad \|F_0\|^{-2}F_0^H I'_s(k) = w(k). \tag{2-7}$$

Since $s(k) = A^{-1}(z)s'(k)$, it follows from (2-6) that (2-3) holds true with $I_s(k) = I'_s(k)$ such that
$\mathcal{D}(z) = A(z)\mathcal{D}'(z)$. This completes the proof. □


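Lemma 2. Let $\mathcal{R}_{ssL_e}$ denote a $[N(L_e+1)] \times [N(L_e+1)]$ matrix with its ij-th block element as
$R_{ss}(j-i) = E\{s(k+j-i)s^H(k)\}$. Then under (H1)-(H4), $\rho(\mathcal{R}_{ssL_e}) \leq NL_e + 1$ for $L_e \geq n_a + n_b - 1$
where $\rho(A)$ denotes the rank of $A$. •

Proof: It follows from Lemma 1 and (2-3) that

$$[\, I \;\; D_1 \;\; \cdots \;\; D_{n_a+n_b-1} \;\; 0 \;\; \cdots \;\; 0 \,]\, \mathcal{R}_{ssL_e} = [\, F_0F_0^H \;\; 0 \;\; \cdots \;\; 0 \,]. \tag{2-8}$$

Clearly

$$\rho([\, I \;\; D_1 \;\; \cdots \;\; D_{n_a+n_b-1} \;\; 0 \;\; \cdots \;\; 0 \,]) = N \tag{2-9}$$

and

$$\rho([\, F_0F_0^H \;\; 0 \;\; \cdots \;\; 0 \,]) = 1. \tag{2-10}$$

Using (2-8)-(2-10) and Sylvester’s inequality [7, p. 655], it follows that

$$\rho(\mathcal{R}_{ssL_e}) + N - N(L_e + 1) \leq 1 \tag{2-11}$$

which yields the desired result. □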


3  Blind  Equalization:  No  Common  Zeros 

In this section IIR channels with no common subchannel zeros are considered. For these channels
finite  length  inverses  and  zero-forcing  equalizers  exist.  The  main  objective  of  this  paper  is  to  design 
a  blind  MMSE  linear  equalizer  of  a  specified  length  and  delay.  To  this  end,  as  will  become  clear  in 
Sec.  3.2,  we  need  to  consider  the  design  of  a  zero-forcing  zero-delay  linear  equalizer  of  a  specified 
length  which  is  discussed  in  Sec.  3.1. 

Assume  that  assumptions  (H1)-(H4)  hold  true.  In  addition  assume  the  following  regarding  the 
measurement  noise: 

(H5) $\{n(k)\}$ is zero-mean with $E\{n(k+\tau)n^H(k)\} = \sigma_n^2 I_{N \times N}\,\delta(\tau)$ where $I_{N \times N}$ is the $N \times N$
identity matrix.
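3.1 Zero-Delay Zero-Forcing Equalizer

Using (2-1) and (2-2) and setting $\mathcal{F}(z) = \sum_{l=0}^{\infty} F_l z^{-l}$ we have

$$\sum_{l=0}^{L_e} G_l F_{m-l} = \begin{cases} 1, & m = 0 \\ 0, & m = 1, 2, \cdots, \end{cases} \tag{3-1}$$

leading to

$$[\, G_0 \;\; G_1 \;\; \cdots \;\; G_{L_e} \,]\, \mathcal{S} = [\, 1 \;\; 0 \;\; \cdots \,] \tag{3-2}$$

where $\mathcal{S}$ is the $(N(L_e+1)) \times \infty$ matrix given by

$$\mathcal{S} = \begin{bmatrix} F_0 & F_1 & F_2 & F_3 & \cdots \\ 0 & F_0 & F_1 & F_2 & \cdots \\ \vdots & & \ddots & & \\ 0 & \cdots & 0 & F_0 & F_1 \; \cdots \end{bmatrix}. \tag{3-3}$$

Let $\mathcal{S}^{\#}$ denote the pseudoinverse of $\mathcal{S}$. By [15, Prop. 1], $\mathcal{S}^{\#} = \mathcal{S}^H(\mathcal{S}\mathcal{S}^H)^{\#}$. Then the minimum
norm solution to the FIR equalizer is given by [15, Sec. 6.11]

$$[\, G_0 \;\; G_1 \;\; \cdots \;\; G_{L_e} \,] = [\, 1 \;\; 0 \;\; \cdots \,]\, \mathcal{S}^{\#} = [\, F_0^H \;\; 0 \;\; \cdots \;\; 0 \,]\, (\mathcal{S}\mathcal{S}^H)^{\#}. \tag{3-4}$$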


In a fashion similar to $\mathcal{R}_{ssL_e}$ in Lemma 2, let $\mathcal{R}_{yyL_e}$ denote a $[N(L_e+1)] \times [N(L_e+1)]$
matrix with its ij-th block element as $R_{yy}(j-i) = E\{y(k+j-i)y^H(k)\}$; define similarly $\mathcal{R}_{nnL_e}$
pertaining to the additive noise. Carry out an eigendecomposition of $\mathcal{R}_{yyL_e}$. Then the smallest
$N-1$ eigenvalues of $\mathcal{R}_{yyL_e}$ equal $\sigma_n^2$ because under (H1)-(H4), $\rho(\mathcal{R}_{ssL_e}) \leq NL_e + 1$ whereas
$\rho(\mathcal{R}_{nnL_e}) = NL_e + N = \rho(\mathcal{R}_{yyL_e})$. Thus a consistent estimate of $\sigma_n^2$ is obtained by taking it
as the average of the smallest $N-1$ eigenvalues of $\hat{\mathcal{R}}_{yyL_e}$, the data-based consistent estimate of
$\mathcal{R}_{yyL_e}$.

Under (H4) and (H5),

$$(\mathcal{S}\mathcal{S}^H) = \mathcal{R}_{ssL_e} = \mathcal{R}_{yyL_e} - \mathcal{R}_{nnL_e} = \mathcal{R}_{yyL_e} - \sigma_n^2 I. \tag{3-5}$$
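The eigenvalue argument for the noise variance can be illustrated with exact (not estimated) statistics. This is a sketch under stated assumptions: the two-subchannel FIR channel `F`, the helper `sylvester`, and the values of `Le` and `sigma2` are hypothetical, and the input is taken to be white with unit variance so that the signal correlation matrix equals the product of the channel matrix with its conjugate transpose.

```python
import numpy as np

def sylvester(F, Le):
    """Finite block-Toeplitz channel matrix; block row l holds F_m in column l + m."""
    nb1, N = F.shape
    S = np.zeros((N * (Le + 1), Le + nb1), dtype=complex)
    for l in range(Le + 1):
        for m in range(nb1):
            S[l * N:(l + 1) * N, l + m] = F[m]
    return S

# Hypothetical FIR channel with no common subchannel zeros.
F = np.array([[1.0, 0.5],
              [0.4, -0.3]], dtype=complex)
N, Le, sigma2 = 2, 1, 0.1

S = sylvester(F, Le)
R_ss = S @ S.conj().T                          # exact signal correlation matrix
R_yy = R_ss + sigma2 * np.eye(N * (Le + 1))    # white noise adds sigma2 * I

rank_ss = np.linalg.matrix_rank(R_ss)
eigvals = np.linalg.eigvalsh(R_yy)             # returned in ascending order
sigma2_hat = eigvals[:N - 1].mean()            # average of the smallest N - 1

print(rank_ss, sigma2_hat)
```

With data-based correlation estimates the smallest eigenvalues only approximate the noise variance, which is why the text averages them.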
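Thus, $(\mathcal{S}\mathcal{S}^H)$ can be estimated from noisy data. However, we don’t know $F_0$. To this end, we seek
an $N \times N$ FIR filter $\mathcal{G}_a(z) := \sum_{i=0}^{L_e} G_{ai} z^{-i}$ satisfying

$$[\, G_{a0} \;\; G_{a1} \;\; \cdots \;\; G_{aL_e} \,] = [\, I_{N \times N} \;\; 0 \;\; \cdots \;\; 0 \,]\, \mathcal{R}^{\#}_{ssL_e}. \tag{3-6}$$

Comparing (3-4) and (3-6) it follows that

$$[\, G_0 \;\; G_1 \;\; \cdots \;\; G_{L_e} \,] = F_0^H\, [\, G_{a0} \;\; G_{a1} \;\; \cdots \;\; G_{aL_e} \,] \tag{3-7}$$

leading to

$$\sum_{i=0}^{L_e} G_i z^{-i} =: \mathcal{G}(z) = F_0^H \mathcal{G}_a(z). \tag{3-8}$$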

In practice, therefore, we apply $\mathcal{G}_a(z)$ to the data leading to

$$v(k) := \mathcal{G}_a(z)y(k) = v_s(k) + \mathcal{G}_a(z)n(k) \tag{3-9}$$

such that

$$F_0^H v_s(k) = w(k) \tag{3-10}$$

where

$$v_s(k) := \mathcal{G}_a(z)[y(k) - n(k)] = \mathcal{G}_a(z)s(k). \tag{3-11}$$

In (3-10) $\{w(k)\}$ is a white scalar sequence (by assumption (H4)), however, $\{v_s(k)\}$ is not
necessarily a white vector sequence. Given the second-order statistics of $\{v_s(k)\}$, how does one
estimate $F_0$ so that $\{w(k)\}$ satisfying (H4) is recovered? We need to have $R_{ww}(\tau) := E\{w(k+\tau)w^*(k)\} = 0$
for $|\tau| \neq 0$. By (3-9), $R_{ww}(\tau) = F_0^H R_{v_sv_s}(\tau) F_0$. Define ($L > 0$ is some large integer)

$$\mathcal{R}_{v_sv_s} := \begin{bmatrix} R_{v_sv_s}(-1) \\ R_{v_sv_s}(-2) \\ \vdots \\ R_{v_sv_s}(-L) \end{bmatrix} \tag{3-12}$$

where $R_{v_sv_s}(\tau) := E\{v_s(k+\tau)v_s^H(k)\}$.

Lemma 3. $\mathcal{R}_{v_sv_s}$ is rank deficient for any $L \geq 1$ such that $\mathcal{R}_{v_sv_s}F_0 = 0$. •

Proof: We have

$$R_{wv_s}(\tau) = E\{w(k+\tau)v_s^H(k)\} = 0 \quad \forall \tau \geq 1 \tag{3-13}$$

because $v_s(k)$ is obtained by causal filtering of $y(k)$, hence of $w(k)$. Using (3-10) in (3-13) it then
follows that there exists a $N \times 1$ $F_0 \neq 0$ such that

$$F_0^H R_{v_sv_s}(\tau) = 0 \quad \forall \tau \geq 1. \tag{3-14}$$

Equivalently, we have from (3-14)

$$R_{v_sv_s}(-\tau)F_0 = 0 \quad \forall \tau \geq 1. \tag{3-15}$$

The desired result is then immediate. □

Pick an $N \times 1$ column-vector $H_0$ to equal the rightmost right singular vector in a singular-value
decomposition (SVD) $\mathcal{R}_{v_sv_s} = U\Sigma V^H$, i.e. the right singular vector corresponding to the smallest
singular value. In other words, pick $H_0$ to equal the last column of $V$. Then since ideally the
smallest singular value of $\mathcal{R}_{v_sv_s}$ is zero, we have $R_{v_sv_s}(-\tau)H_0 = 0$ for $\tau = 1, 2, \cdots, L$. This, in
turn, implies that

$$(H_0^H R_{v_sv_s}(-\tau)H_0)^H = H_0^H R_{v_sv_s}(\tau)H_0 = 0 \quad \text{for } \tau = 1, 2, \cdots, L. \tag{3-16}$$

Since the overall system with $w(k)$ as input and $H_0^H v_s(k)$ as output is ARMA($n_a$, $n_b + L_e$), it
follows that $H_0^H v_s(k)$ is zero-mean white if $L \geq n_b + L_e$, hence, a scaled version of $w(k)$. Therefore,
we have ($\alpha \neq 0$)

$$H_0^H v_s(k) =: w'(k) = \alpha\, w(k) \tag{3-17}$$

(because … = 0). Thus, once $H_0$ is found, one has the complete inverse filter to recover a
scaled version of $w(k)$ via a zero-forcing filter.
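The null-space search for $H_0$ can be sketched with exact statistics for a small FIR case. Everything specific here is an assumption for illustration: the two-subchannel channel `F`, the helper `sylvester`, $L_e = 1$ and $L = 4$ lags, and a white unit-variance input. With only two subchannels the stacked correlation matrix has a one-dimensional null space, so the recovered vector should line up with the first channel coefficient up to a unit-modulus factor.

```python
import numpy as np

def sylvester(F, Le):
    """Finite block-Toeplitz channel matrix; block row l holds F_m in column l + m."""
    nb1, N = F.shape
    S = np.zeros((N * (Le + 1), Le + nb1), dtype=complex)
    for l in range(Le + 1):
        for m in range(nb1):
            S[l * N:(l + 1) * N, l + m] = F[m]
    return S

F = np.array([[1.0, 0.5],
              [0.4, -0.3]], dtype=complex)     # hypothetical channel, rows F_0, F_1
N, Le, L = 2, 1, 4

S = sylvester(F, Le)
R_ss = S @ S.conj().T

# Filter taps [G_a0 ... G_aLe] = [I 0 ... 0] R_ss^#  (first N rows of the pseudoinverse).
Ga = np.linalg.pinv(R_ss, rcond=1e-10, hermitian=True)[:N, :]

# Column m of C is the response of v_s(k) to w(k - m).
C = Ga @ S
P = C.shape[1]

# R_{v_s v_s}(-tau) = sum_m C_m C_{m+tau}^H for a white unit-variance input.
blocks = []
for tau in range(1, L + 1):
    if tau < P:
        blocks.append(C[:, :P - tau] @ C[:, tau:].conj().T)
    else:
        blocks.append(np.zeros((N, N), dtype=complex))
R_stack = np.vstack(blocks)                    # (L N) x N stacked matrix

U, sv, Vh = np.linalg.svd(R_stack)
H0 = Vh[-1].conj()                             # right singular vector of smallest singular value
F0 = F[0]
alignment = abs(np.vdot(H0, F0)) / np.linalg.norm(F0)

print(sv[-1], alignment)
```

For more than two subchannels the null space may be larger than one dimension in degenerate cases, which this sketch does not examine.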

Remark 2: In [1] $F_0^H$ in (3-4) has been replaced with an $N$-row vector $[\, 1 \;\; 0 \;\; \cdots \;\; 0 \,]$. This
requirement of [1] is not restrictive for single-receiver causal systems with fractional sampling as
one can always achieve this by “shifting,” i.e. “re-grouping” of fractional samples per symbol,
to set $f_1(0) \neq 0$ but $f_i(0) = 0$ for $i = 2, 3, \cdots, N$. It does demand symbol synchronization so
that fractional samples belonging to a given symbol are known thereby allowing for shifting or
re-grouping to achieve the aforementioned requirement. In this paper we don’t require such a
synchronization; only the baud rate ought to be known. The approach of [1] does not apply to
antenna arrays whereas our approach does. □

Remark 3: If the noise is colored with known color except for a scalar scale factor, then we
can follow prewhitening (as in [5]) and convert the problem to one that obeys (H5). □

Remark 4: $F_0$ can also be estimated (up to a scale factor as unit norm $H_0$) using the
prediction error method of [9],[14] (even though [9] and [14] restrict their discussion to FIR models
and real-valued data). Using (2-3) we obtain ($L_e \geq n_a + n_b - 1$)

$$[\, D_1 \;\; D_2 \;\; \cdots \;\; D_{L_e} \,]\, \mathcal{R}_{ss(L_e-1)} = -[\, R_{ss}(1) \;\; R_{ss}(2) \;\; \cdots \;\; R_{ss}(L_e) \,] \tag{3-18}$$

leading to the minimum norm solution

$$[\, D_1 \;\; D_2 \;\; \cdots \;\; D_{L_e} \,] = -[\, R_{ss}(1) \;\; R_{ss}(2) \;\; \cdots \;\; R_{ss}(L_e) \,]\, \mathcal{R}^{\#}_{ss(L_e-1)}. \tag{3-19}$$

Note that if $L_e > n_a + n_b - 1$, then $D_i = 0$ for all $i > n_a + n_b - 1$ by virtue of Lemma 2. By (2-3)
and (2-4) we also have

$$R_{II}(0) := E\{I_s(k)I_s^H(k)\} = F_0F_0^H = R_{ss}(0) + \sum_{i=1}^{L_e} D_i R_{ss}(-i). \tag{3-20}$$

Clearly $\rho(R_{II}(0)) = 1$. Carry out an eigendecomposition of $R_{II}(0)$. Pick $H_0$ as the unit norm
eigenvector corresponding to the largest eigenvalue (ideally the only nonzero eigenvalue) of $R_{II}(0)$.
□

Remark 5: It is worth noting that although $F_0^H v_s(k) = w(k)$ (see (3-10)) and $\|F_0\|^{-2}F_0^H I_s(k) = w(k)$
(see (2-4)), $\{I_s(k)\}$ is zero-mean white (linear innovations) whereas $\{v_s(k)\}$ is in general colored.
□


3.2  MMSE  Equalizer  with  Delay  d 

We wish to design an MMSE (minimum mean-square error) linear equalizer of a specified length.
It is not too hard to establish (using the orthogonality principle [16], for example) that the MMSE
equalizer of length $L_e + 1$ to estimate $w(k-d)$ ($d \geq 0$) based upon $y(n)$, $n = k, k-1, \cdots, k-L_e$,
satisfies

$$[\, G_{d,0} \;\; G_{d,1} \;\; \cdots \;\; G_{d,L_e} \,]\, \mathcal{R}_{yyL_e} = [\, F_d^H \;\; F_{d-1}^H \;\; \cdots \;\; F_0^H \;\; 0 \;\; \cdots \;\; 0 \,] \tag{3-21}$$

where $\mathcal{R}_{yyL_e}$ has its ij-th block-element given by $R_{yy}(j-i)$. Clearly one can obtain a consistent
estimate of $\mathcal{R}_{yyL_e}$ from the given data. It remains to estimate $F_i$’s to complete the design. Here
the discussion of Sec. 3.1 becomes relevant. There we found a $H_0$ to satisfy (3-17). From (3-9) and
(3-17) we have

$$H_0^H v_s(k) = \sum_{i=0}^{L_e} H_0^H G_{ai}\, s(k-i). \tag{3-22}$$

Using (3-22) and taking expectations we have

$$E\{s(n)v_s^H(n-\tau)\}H_0 = \sum_{i=0}^{L_e} R_{ss}(\tau+i)\, G_{ai}^H H_0. \tag{3-23}$$

Using (1-2) and (3-17) we have

$$E\{s(n)v_s^H(n-\tau)\}H_0 = \alpha^* F_\tau. \tag{3-24}$$

Hence, we have from (3-23) and (3-24)

$$F_\tau^H = \alpha^{-1} H_0^H \sum_{i=0}^{L_e} G_{ai} R_{ss}^H(\tau+i). \tag{3-25}$$

Let $\mathcal{R}_{d,ssL_e}$ denote a $[N(L_e+1)] \times [N(L_e+1)]$ matrix with its ij-th block element as
$E\{s(k+d+j-i)s^H(k)\}$. Then (3-25) can be expressed as

$$[\, F_d^H \;\; F_{d-1}^H \;\; \cdots \;\; F_0^H \;\; 0 \;\; \cdots \;\; 0 \,] = \alpha^{-1} H_0^H\, [\, G_{a0} \;\; G_{a1} \;\; \cdots \;\; G_{aL_e} \,]\, \mathcal{R}^H_{d,ssL_e}. \tag{3-26}$$

Finally, using (3-6) and (3-26) in (3-21) we obtain the desired solution

$$[\, G_{d,0} \;\; G_{d,1} \;\; \cdots \;\; G_{d,L_e} \,] = \alpha^{-1} H_0^H\, [\, I_{N \times N} \;\; 0 \;\; \cdots \;\; 0 \,]\, \mathcal{R}^{\#}_{ssL_e}\, \mathcal{R}^H_{d,ssL_e}\, \mathcal{R}^{-1}_{yyL_e}. \tag{3-27}$$

The MMSE estimate $\hat{w}(t-d)$ of $w(t-d)$ is then given by

$$\hat{w}(t-d) = \sum_{i=0}^{L_e} G_{d,i}\, y(t-i). \tag{3-28}$$

In practice, since $\alpha$ is unknown, one obtains a scaled version

$$\hat{w}'(t-d) = \sum_{i=0}^{L_e} \alpha\, G_{d,i}\, y(t-i) = \alpha\, \hat{w}(t-d). \tag{3-29}$$
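For intuition, the MMSE normal equations can be solved directly when the channel is known, which is the non-blind reference case rather than the blind design of this section. The sketch below assumes a hypothetical two-subchannel FIR channel `F`, a helper `sylvester`, equalizer order `Le = 3`, delay `d = 2` and a very small noise variance; under these assumptions the cascade of channel and equalizer should be close to a pure delay.

```python
import numpy as np

def sylvester(F, Le):
    """Finite block-Toeplitz channel matrix; block row l holds F_m in column l + m."""
    nb1, N = F.shape
    S = np.zeros((N * (Le + 1), Le + nb1), dtype=complex)
    for l in range(Le + 1):
        for m in range(nb1):
            S[l * N:(l + 1) * N, l + m] = F[m]
    return S

F = np.array([[1.0, 0.5],
              [0.4, -0.3]], dtype=complex)     # hypothetical channel, rows F_0, F_1
Le, d, sigma2 = 3, 2, 1e-6

S = sylvester(F, Le)
R_yy = S @ S.conj().T + sigma2 * np.eye(S.shape[0])

# Column d of S stacks [F_d; F_{d-1}; ...; F_0; 0; ...]; its conjugate transpose
# is the right-hand side of the normal equations for delay d.
rhs = S[:, d]

# Solve Gd R_yy = rhs^H. R_yy is Hermitian, so Gd^H = R_yy^{-1} rhs.
Gd = np.linalg.solve(R_yy, rhs).conj()

combined = Gd @ S                              # channel-equalizer cascade
target = np.zeros(S.shape[1])
target[d] = 1.0
print(np.round(np.abs(combined), 4))
```

Raising `sigma2` trades residual intersymbol interference against noise enhancement, so `combined` moves away from the ideal delay.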

3.3  Algorithms:  Practical  Implementation 

Given data $y(k)$, $k = 1, 2, \cdots, T$. Pick the length $L_e + 1$ and delay $d$ of the MMSE equalizer.

3.3.1 ALGORITHM I :


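Here $F_0$ is estimated as the unit norm $H_0$ that lies in the null space of $\mathcal{R}_{v_sv_s}$.

I.1 Estimate the correlation function of the measurements at lag $m$ as

$$\hat{R}_{yy}(m) = \frac{1}{T}\sum_{k=1}^{T} y(k+m)\,y^H(k) \tag{3-30}$$

where we take $y(k+m) = 0$ if $k+m < 1$ or $> T$. Define the $[N(L_e+1)] \times [N(L_e+1)]$
matrix $\hat{\mathcal{R}}_{yyL_e}$ with its ij-th block element as $\hat{R}_{yy}(j-i)$. Carry out an eigendecomposition
of $\hat{\mathcal{R}}_{yyL_e}$. Let $\lambda_i$ ($i = NL_e+2, \cdots, NL_e+N$) denote the smallest $N-1$ eigenvalues of $\hat{\mathcal{R}}_{yyL_e}$.
Estimate the noise variance as

$$\hat{\sigma}_n^2 = \frac{1}{N-1}\sum_{i=NL_e+2}^{NL_e+N} \lambda_i. \tag{3-31}$$

The signal correlation function at lag $m$ is then estimated as

$$\hat{R}_{ss}(m) = \hat{R}_{yy}(m) - \hat{\sigma}_n^2 I_{N \times N}\,\delta(m) \tag{3-32}$$

where $\delta(m)$ is the Kronecker delta function. Define the $[N(L_e+1)] \times [N(L_e+1)]$ signal
correlation matrix estimate $\hat{\mathcal{R}}_{ssL_e}$ with its ij-th block element as $\hat{R}_{ss}(j-i)$.

I.2 Now we implement (3-6). First we need to calculate $\hat{\mathcal{R}}^{\#}_{ssL_e}$. Carry out a singular value
decomposition of $\hat{\mathcal{R}}_{ssL_e}$ leading to $\hat{\mathcal{R}}_{ssL_e} = USV^H$ where $S = \mathrm{diag}\{s_i, i = 1, 2, \cdots, NL_e+N\}$.
The rank $n_1$ of $\hat{\mathcal{R}}_{ssL_e}$ is determined as the smallest $n$ for which

$$g_n := \frac{\sum_{i=n+1}^{NL_e+N} s_i}{\sum_{i=1}^{NL_e+N} s_i} < \epsilon_1 \tag{3-33}$$

where $\epsilon_1 > 0$ is a small number. [For simulations presented in Sec. 5 we took $\epsilon_1 = 0.001$].
The desired pseudoinverse is then calculated as

$$\hat{\mathcal{R}}^{\#}_{ssL_e} = V_1 S_1^{-1} U_1^H \tag{3-34}$$

where $S_1 = \mathrm{diag}\{s_i, i = 1, 2, \cdots, n_1\}$ and $U_1$ and $V_1$ are comprised of the left and the right
(respectively) singular vectors corresponding to the singular values retained in $S_1$. Using
(3-34) calculate

$$[\, \hat{G}_{a0} \;\; \hat{G}_{a1} \;\; \cdots \;\; \hat{G}_{aL_e} \,] = [\, I_{N \times N} \;\; 0 \;\; \cdots \;\; 0 \,]\, \hat{\mathcal{R}}^{\#}_{ssL_e}. \tag{3-35}$$

I.3 Using (3-11) estimate $R_{v_sv_s}(m)$ as

$$\hat{R}_{v_sv_s}(m) = \sum_{l_1=0}^{L_e}\sum_{l_2=0}^{L_e} \hat{G}_{al_1}\, \hat{R}_{ss}(m + l_2 - l_1)\, \hat{G}^H_{al_2} \tag{3-36}$$

where $\hat{R}_{ss}(m)$ has been discussed in (3-30)-(3-32). Define the $(LN) \times N$ matrix

$$\hat{\mathcal{R}}_{v_sv_s} := \begin{bmatrix} \hat{R}_{v_sv_s}(-1) \\ \hat{R}_{v_sv_s}(-2) \\ \vdots \\ \hat{R}_{v_sv_s}(-L) \end{bmatrix}. \tag{3-37}$$

Carry out an SVD of $\hat{\mathcal{R}}_{v_sv_s}$ and set

$$H_0 = \text{‘rightmost’ right singular vector of } \hat{\mathcal{R}}_{v_sv_s}. \tag{3-38}$$

I.4 Define the $[N(L_e+1)] \times [N(L_e+1)]$ matrix $\hat{\mathcal{R}}_{d,ssL_e}$ with its ij-th block element as $\hat{R}_{ss}(d+j-i)$.
The MMSE equalizer with delay $d$ is calculated as

$$[\, \hat{G}_{d,0} \;\; \hat{G}_{d,1} \;\; \cdots \;\; \hat{G}_{d,L_e} \,] = H_0^H\, [\, I_{N \times N} \;\; 0 \;\; \cdots \;\; 0 \,]\, \hat{\mathcal{R}}^{\#}_{ssL_e}\, \hat{\mathcal{R}}^H_{d,ssL_e}\, \hat{\mathcal{R}}^{-1}_{yyL_e}. \tag{3-39}$$

In (3-39) $\hat{\mathcal{R}}^{\#}_{ssL_e}$ is the pseudoinverse of $\hat{\mathcal{R}}_{ssL_e}$ calculated as in (3-34) except that a larger error
threshold $\epsilon_2$ is used in (3-33) instead of $\epsilon_1$. The rank $n_2$ of $\hat{\mathcal{R}}_{ssL_e}$ is determined as the smallest
$n$ for which $g_n < \epsilon_2$ in (3-33) instead of $g_n < \epsilon_1$. [For simulations presented in Sec. 5 we took
$\epsilon_2 = 0.01$].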
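Step I.1 translates fairly directly into code. The sketch below is illustrative only: the function names `corr_est`, `block_toeplitz` and `estimate_signal_stats`, the synthetic two-subchannel channel `F`, the record length and the noise level are all assumptions, not values from the paper. It forms the lagged sample correlations with zero padding outside the record, assembles the block matrix, averages the smallest $N-1$ eigenvalues, and subtracts the noise estimate at lag zero.

```python
import numpy as np

def corr_est(y, m):
    """Sample correlation (1/T) * sum_k y(k+m) y^H(k), with zeros outside the record."""
    T, N = y.shape
    if m < 0:
        return corr_est(y, -m).conj().T       # R(-m) = R(m)^H
    if m >= T:
        return np.zeros((N, N), dtype=complex)
    return y[m:].T @ y[:T - m].conj() / T

def block_toeplitz(corr, Le):
    """Matrix whose (i, j) block is corr(j - i), for i, j = 0, ..., Le."""
    return np.block([[corr(j - i) for j in range(Le + 1)] for i in range(Le + 1)])

def estimate_signal_stats(y, Le):
    """Noise variance from the smallest N - 1 eigenvalues, then signal correlations."""
    N = y.shape[1]
    R_yy = block_toeplitz(lambda m: corr_est(y, m), Le)
    eigvals = np.linalg.eigvalsh(R_yy)        # ascending order
    sigma2_hat = eigvals[:N - 1].mean()

    def R_ss(m):
        R = corr_est(y, m)
        return R - sigma2_hat * np.eye(N) if m == 0 else R

    return sigma2_hat, R_ss

# Synthetic check: hypothetical 2-subchannel FIR channel driven by 4-QAM symbols.
rng = np.random.default_rng(0)
T, sigma2 = 20000, 0.1
F = np.array([[1.0, 0.5], [0.4, -0.3]], dtype=complex)      # rows F_0, F_1
w = (rng.choice([-1.0, 1.0], T) + 1j * rng.choice([-1.0, 1.0], T)) / np.sqrt(2)
s = np.stack([np.convolve(w, F[:, i])[:T] for i in range(2)], axis=1)
n = np.sqrt(sigma2 / 2) * (rng.standard_normal((T, 2)) + 1j * rng.standard_normal((T, 2)))
y = s + n

sigma2_hat, R_ss = estimate_signal_stats(y, Le=1)
print(sigma2_hat)
```

The estimate is only approximately equal to the true noise variance for a finite record; the tolerance one should expect depends on the record length and the channel.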


Remark 6. In (3-39) calculation of $\hat{\mathcal{R}}^{\#}_{ssL_e}$ is related to computation of some of the leading coefficients
of the channel impulse response whereas in (3-35) calculation of $\hat{\mathcal{R}}^{\#}_{ssL_e}$ is related to the
calculation of the null space of $\hat{\mathcal{R}}_{v_sv_s}$. Heuristically, a higher value of $\epsilon$ in (3-33) leads to higher
“intersymbol interference” but lower “noise enhancement” in a zero-forcing equalizer design, and
vice-versa. In estimating $H_0$ via (3-38) suppression of intersymbol interference is more important
in order to ‘better define’ the null space of $\hat{\mathcal{R}}_{v_sv_s}$. In contrast, in (3-39) (also recall (3-6) and
(3-26)) one requires a compromise between intersymbol interference and noise enhancement while
estimating some of the channel impulse response coefficients. □
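The thresholded pseudoinverse discussed in Remark 6 can be sketched as follows. The rank rule coded here (keep the smallest number of singular values such that the discarded ones sum to less than a fraction `eps` of the total) is an assumption about the criterion, since the printed formula is not fully legible in the source; the function name `truncated_pinv` is also hypothetical.

```python
import numpy as np

def truncated_pinv(R, eps):
    """Pseudoinverse keeping the leading singular values.

    The retained rank is the smallest n for which the sum of the discarded
    singular values is below eps times the sum of all singular values.
    Assumes R is not the zero matrix.
    """
    U, s, Vh = np.linalg.svd(R)
    total = s.sum()
    tail = total - np.cumsum(s)               # tail[n - 1] = sum of s_i for i > n
    n_keep = int(np.argmax(tail / total < eps)) + 1
    U1 = U[:, :n_keep]
    V1 = Vh[:n_keep].conj().T
    return (V1 / s[:n_keep]) @ U1.conj().T, n_keep

# Small check: the third singular value is negligible and gets discarded.
R = np.diag([4.0, 1.0, 1e-6])
R_pinv, rank = truncated_pinv(R, eps=0.001)
print(rank)
print(np.round(R_pinv, 6))
```

A larger `eps` discards more of the small singular values, which matches the heuristic in the remark: less noise enhancement at the price of more residual intersymbol interference.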


3.3.2  ALGORITHM  II  : 

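Here $F_0$ is estimated as in Remark 4.

II.1 Repeat step I.1 of Algorithm I.

II.2 Calculate the pseudoinverse $\hat{\mathcal{R}}^{\#}_{ss(L_e-1)}$ as in Algorithm I. Calculate

$$[\, \hat{D}_1 \;\; \hat{D}_2 \;\; \cdots \;\; \hat{D}_{L_e} \,] = -[\, \hat{R}_{ss}(1) \;\; \hat{R}_{ss}(2) \;\; \cdots \;\; \hat{R}_{ss}(L_e) \,]\, \hat{\mathcal{R}}^{\#}_{ss(L_e-1)}. \tag{3-40}$$

Further calculate

$$\hat{R}_{II}(0) = \hat{R}_{ss}(0) + \sum_{i=1}^{L_e} \hat{D}_i \hat{R}_{ss}(-i). \tag{3-41}$$

Set $H_0$ equal to the unit norm eigenvector corresponding to the largest eigenvalue of $\hat{R}_{II}(0)$.

II.3 Repeat step I.4 of Algorithm I with $H_0$ obtained from step II.2.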
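Step II.2 can be tried out with exact correlations for a small FIR case. In this sketch the channel `F`, the helper `exact_corr`, and the predictor order `Le = 3` are assumptions chosen so that the past samples determine the previous input symbol; the input is assumed white with unit variance. The innovations covariance should then be the rank-one matrix built from the first channel coefficient, and its dominant eigenvector should align with that coefficient.

```python
import numpy as np

def exact_corr(F, m):
    """R_ss(m) = sum_l F_{l+m} F_l^H for an FIR channel with white unit-variance input."""
    nb1, N = F.shape
    if m < 0:
        return exact_corr(F, -m).conj().T
    R = np.zeros((N, N), dtype=complex)
    for l in range(nb1 - m):                  # empty when m exceeds the channel order
        R += np.outer(F[l + m], F[l].conj())
    return R

F = np.array([[1.0, 0.5],
              [0.4, -0.3]], dtype=complex)     # hypothetical channel, rows F_0, F_1
N, Le = 2, 3

# Block matrix with (i, j) block R_ss(j - i), i, j = 1, ..., Le.
R_prev = np.block([[exact_corr(F, j - i) for j in range(Le)] for i in range(Le)])
rhs = np.hstack([exact_corr(F, i) for i in range(1, Le + 1)])   # [R_ss(1) ... R_ss(Le)]

# Minimum-norm predictor coefficients [D_1 ... D_Le].
D = -rhs @ np.linalg.pinv(R_prev, rcond=1e-10, hermitian=True)

# Innovations covariance R_II(0) = R_ss(0) + sum_i D_i R_ss(-i).
R_II = exact_corr(F, 0)
for i in range(1, Le + 1):
    R_II = R_II + D[:, (i - 1) * N:i * N] @ exact_corr(F, -i)
R_II = 0.5 * (R_II + R_II.conj().T)           # remove rounding asymmetry

eigvals, eigvecs = np.linalg.eigh(R_II)       # ascending eigenvalues
H0 = eigvecs[:, -1]                           # unit-norm dominant eigenvector
F0 = F[0]
alignment = abs(np.vdot(H0, F0)) / np.linalg.norm(F0)
print(alignment)
```

With estimated correlations the second eigenvalue is no longer exactly zero, and the gap between the two eigenvalues gives a rough sense of how well defined the estimate is.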


3.3.3  ALGORITHM  III  : 

Here we will use (3-21) with $F_i$ ($i = 0, 1, \cdots, d$) estimated using the basic approach of [9] and [14].
Although [9] and [14] derive all their results under the assumption of FIR channels with no common
zeros, their results extend (with straightforward modifications) to models that satisfy (H1)-(H5)
by virtue of Lemma 1. By (2-4), we have

$$w(k) = \|F_0\|^{-2}F_0^H I_s(k) = \|F_0\|^{-1}H_0^H I_s(k). \tag{3-42}$$

By (1-2) and (1-4) it follows that

$$s(k) = \sum_{i=0}^{\infty} F_i\, w(k-i). \tag{3-43}$$

From (3-43) and (H4), we have the relations

$$E\{w(k-l)s^H(k)\} = F_l^H \quad \text{for } l \geq 0. \tag{3-44}$$

From (2-3) and (3-42), we have the relations

$$E\{w(k-l)s^H(k)\} = \|F_0\|^{-1}H_0^H \left[ R_{ss}(-l) + \sum_{i=1}^{M} D_i R_{ss}(-l-i) \right]. \tag{3-45}$$

From (3-44) and (3-45) it follows that

$$F_l^H = \|F_0\|^{-1}H_0^H \left[ R_{ss}^H(l) + \sum_{i=1}^{M} D_i R_{ss}^H(l+i) \right]. \tag{3-46}$$

Based upon the above discussion, [9] and [14], we have the following algorithm:

III.1 Repeat step I.1 of Algorithm I.

III.2 Repeat step II.2 of Algorithm II.

III.3 Estimate $F_l^H$ up to a scale factor as

$$\hat{F}_l^H = H_0^H \left[ \hat{R}_{ss}^H(l) + \sum_{i=1}^{L_e} \hat{D}_i \hat{R}_{ss}^H(l+i) \right], \quad l = 0, 1, \cdots, d. \tag{3-47}$$

III.4 The MMSE equalizer of length $L_e + 1$ and with delay $d$ is calculated (up to a scale factor) as

$$[\, \hat{G}_{d,0} \;\; \hat{G}_{d,1} \;\; \cdots \;\; \hat{G}_{d,L_e} \,] = [\, \hat{F}_d^H \;\; \hat{F}_{d-1}^H \;\; \cdots \;\; \hat{F}_0^H \;\; 0 \;\; \cdots \;\; 0 \,]\, \hat{\mathcal{R}}^{-1}_{yyL_e}. \tag{3-48}$$


4  Blind  Equalization:  Common  Zeros 

Now  we  allow  common  subchannel  zeros.  In  this  case  since  ideally  we  need  infinite  length  inverses 
and  zero-forcing  equalizers,  the  presented  results  hold  true  only  approximately  for  finite  length 
equalizers.  Assume  that  (H1)-(H5)  hold  true. 

4.1 Minimum-Phase Zeros

Here the SIMO transfer function is

$$\mathcal{F}(z) = A^{-1}(z)\bar{B}(z)B_c(z) \tag{4-1}$$

where $\bar{B}(z)$ satisfies (H2) and $B_c(z)$ is a finite-degree scalar polynomial that collects all the common
zeros of the subchannels. Assume that

(H6) Given model (4-1), $B_c(z) \neq 0$ for $|z| \geq 1$.

Then while $A^{-1}(z)\bar{B}(z)$ has a finite inverse, $B_c^{-1}(z)$ is IIR though causal under (H6). Then (3-2)
holds true approximately for “large” $L_e$, the approximation getting better with increasing $L_e$. Similarly
Lemma 1 holds true approximately for “large” $M$ and Lemma 2 also holds true approximately
for $L_e \geq M$. It is then readily seen that the developments of Secs. 3.1, 3.2 and 3.3 apply to the
current case also.


4.2  Arbitrary  Zeros 

In this case (4-1) is true but $B_c(z)$ does not necessarily satisfy (H6). We may rewrite (4-1) as

$$\mathcal{F}(z) = \bar{\mathcal{F}}(z)\mathcal{F}_{AP}(z) \tag{4-2}$$

where $\mathcal{F}_{AP}(z)$ is an allpass (rational) function such that

$$B_c(z) = \mathcal{F}_{AP}(z)B_{MP}(z) \tag{4-3}$$

and $B_{MP}(z)$ is minimum-phase. Thus (within a scale factor) we have

$$\bar{\mathcal{F}}(z) = A^{-1}(z)\bar{B}(z)B_{MP}(z). \tag{4-4}$$

We may rewrite (1-2) as

$$y(k) = \bar{\mathcal{F}}(z)w'(k) + n(k) \tag{4-5}$$

where

$$w'(k) := \mathcal{F}_{AP}(z)w(k). \tag{4-6}$$

Clearly $w'(k)$ satisfies (H4). Hence, (4-4)-(4-6) satisfy the requirements of Sec. 4.1. Therefore, one
can “approximately” recover $w'(k)$ from the given data by applying the algorithms of Sec. 3.3.

In order to recover $w(k)$ from $w'(k)$, one needs to exploit the higher-order statistics of $\{w'(k)\}$;
see [2], [3] and references therein.
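The spectrally-equivalent minimum-phase counterpart of a scalar polynomial can be computed by reflecting every zero outside the unit circle to its conjugate reciprocal and compensating the gain. The following sketch uses a hypothetical helper `minimum_phase_equivalent` and a made-up first-order example; it only illustrates why second-order statistics cannot distinguish the two polynomials, since their magnitude responses coincide.

```python
import numpy as np

def minimum_phase_equivalent(b):
    """Minimum-phase polynomial with the same magnitude response as B(z).

    b[i] is the coefficient of z^{-i}. Zeros with modulus above one are moved to
    1 / conj(zero) and the gain is scaled by the modulus of each moved zero.
    """
    b = np.asarray(b, dtype=complex)
    zeros = np.roots(b)
    gain = b[0]
    for i, z0 in enumerate(zeros):
        if abs(z0) > 1:
            gain = gain * abs(z0)
            zeros[i] = 1 / np.conj(z0)
    return gain * np.poly(zeros)

def freq_response(b, omega):
    """Evaluate B(e^{j omega}) = sum_i b[i] e^{-j omega i} on a frequency grid."""
    b = np.asarray(b, dtype=complex)
    return np.exp(-1j * np.outer(omega, np.arange(len(b)))) @ b

b_c = [1.0, -2.0]                      # hypothetical common factor with a zero at 2
b_mp = minimum_phase_equivalent(b_c)   # expected to have its zero at 0.5

omega = np.linspace(0, np.pi, 64)
gap = np.max(np.abs(np.abs(freq_response(b_c, omega)) - np.abs(freq_response(b_mp, omega))))
print(b_mp, gap)
```

The ratio of the two polynomials is the allpass factor; its phase is what higher-order statistics would be needed to resolve.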
5  Simulation  Examples

Here  we  consider  two  simulation  examples  to  illustrate  the  proposed  approaches.  Both  of  the 
examples are modified versions of the example from [5]. Example 1 consists of an ARMA model
whose  MA  part  is  taken  from  [5].  Example  2  consists  of  an  MA  (FIR)  model  where  we  augment 
the  FIR  channel  of  [5]  with  a  zero  at  0.5  where  this  zero  is  common  to  all  of  the  four  subchannels. 

For computing $\hat{\mathcal{R}}^{\#}_{ssL_e}$ in (3-34) via SVD, we picked $\epsilon_1 = 0.001$ in (3-33). For computing
$\hat{\mathcal{R}}^{\#}_{ssL_e}$ in (3-39), or $\hat{\mathcal{R}}^{\#}_{ss(L_e-1)}$ in (3-40), via SVD, we picked $\epsilon_2 = 0.01$ in (3-33). Moreover, $\hat{\mathcal{R}}^{-1}_{yyL_e}$ in (3-39)
and (3-48) was also computed using SVD where all singular values smaller than 0.001 × (largest
singular value) were neglected. Thus, calculation of $\hat{\mathcal{R}}^{-1}_{yyL_e}$ was regularized. The measurement SNR
is defined as

$$\mathrm{SNR} = \frac{E\{\|s(k)\|^2\}}{E\{\|n(k)\|^2\}}.$$

The normalized MSE (i.e., MSE divided by $E\{|w(k)|^2\}$) and the probability of detection error
($P_e$) after equalization were taken as the two performance measures after averaging over 100 Monte
Carlo runs. The equalized data were rotated and scaled before calculating the two performance
measures. After designing the equalizers based on the given data record, the designed equalizer
was applied to an independent record of length 3000 symbols in order to calculate normalized MSE
and $P_e$. Therefore, the estimated $P_e$ is not reliable below approximately $10^{-4}$, hence, these values
are not shown in Figs. 2 and 4.

5.1  Example  1. 

We have $N = 4$ in (1-2) with $\mathcal{F}(z) = A^{-1}(z)B(z)$ where

$$A(z) = 1 - 0.5z^{-1},$$

and $B(z)$ is $4 \times 1$ with its i-th element given by

$$\begin{aligned}
B_1(z) &= (-0.049 + j0.359) + (0.482 - j0.569)z^{-1} + (-0.556 + j0.587)z^{-2} + (1.0 + j0.0)z^{-3} + (-0.171 + j0.061)z^{-4} \\
B_2(z) &= (0.443 - j0.0364) + (1.0 + j0.0)z^{-1} + (0.921 - j0.194)z^{-2} + (0.189 - j0.208)z^{-3} + (-0.087 - j0.054)z^{-4} \\
B_3(z) &= (-0.211 - j0.322) + (-0.199 + j0.918)z^{-1} + (1.0 + j0.0)z^{-2} + (-0.284 - j0.524)z^{-3} + (0.136 - j0.190)z^{-4} \\
B_4(z) &= (0.417 + j0.030) + (1.0 + j0.0)z^{-1} + (0.873 + j0.145)z^{-2} + (0.285 + j0.309)z^{-3} + (-0.049 + j0.161)z^{-4}
\end{aligned} \tag{5-2}$$

The MA part $B(z)$ is the same as the FIR channel of [5]. The scalar input $w(k)$ is 4-QAM (as in
[5]).

Transfer function $B(z)$ satisfies (H2) [5], therefore, there exists a finite left inverse of length
$L_e = 4$ (cf. Sec. 2.1). An MMSE equalizer of length $L_e = 12$ (13 taps per subchannel, totaling
52 taps: substantial overfitting!) was designed with a delay $d = 3$ (arbitrarily selected just for
illustration). Algorithms I-III were applied for various record lengths. The equalized output
was scaled to match the true $\{w(k)\}$ before computing the mean-square error (MSE) in the equalized
output. Fig. 1 shows the normalized MSE and Fig. 2 shows the probability of error $P_e$, both averaged
over 100 Monte Carlo runs. It is seen that the proposed design approaches can handle IIR channels
with little difficulty. Algorithm II (newly proposed) performs the best with Algorithm III (based
upon [9] and [14]) being almost as good. The performance of Algorithm I improves with increase
in record length and it approaches that of the other two algorithms for $T = 1000$ symbols.

5.2  Example  2. 

Again we have $N = 4$ in (1-2) but with $\mathcal{F}(z) = B_c(z)B(z)$ where $B(z)$ is as in Example 1 and $B_c(z)$
is a scalar polynomial given by

$$B_c(z) = 1 - 0.5z^{-1}.$$

Thus all four subchannels have a common zero at 0.5. The input $w(k)$ is 4-QAM as in Example 1.
Note that in this example a finite left inverse does not exist.
As in Example 1, an MMSE equalizer of length $L_e = 12$ was designed with a delay $d = 3$. Fig.
3 shows the normalized MSE and Fig. 4 shows the probability of error $P_e$, both averaged over 100
Monte Carlo runs. It is seen that the proposed design approaches can handle subchannels with
common minimum-phase zeros with little difficulty. As in Example 1, Algorithm II performs the
best.

6  Conclusions 

Direct  blind  MMSE  equalization  of  SIMO  channels  using  only  the  second-order  statistics  of  the 
data  was  considered.  Such  channels  arise  when  antenna  arrays  are  used  or  when  signals  with  excess 
bandwidth  are  fractionally  sampled  or  when  both  these  scenarios  are  applicable.  Unlike  the  past 
work on this problem [4],[5],[8]-[14], the proposed solutions are applicable to IIR channels and to
SIMO  systems  having  common  zeros  among  the  various  subchannels  so  long  as  the  common  zeros 
are  minimum-phase.  In  case  of  nonminimum-phase  zeros,  we  recover  an  allpass  filtered  version  of 
the  original  input. 

Three  approaches  were  proposed.  Algorithms  I  and  II  are  inspired  by  [1]  whereas  Algorithm 
III  is  a  straightforward  extension  of  [9]  and  [14].  Algorithm  II  also  exploits  some  results  from  [9] 
and [14]. Two illustrative simulation examples, one consisting of an IIR channel and the other
consisting  of  an  FIR  channel  with  a  common  zero,  were  presented  using  a  4-QAM  information 
sequence.  The  proposed  approaches  work  well.  Algorithm  II  works  the  best  (evaluated  in  terms  of 
mean-square  error  and  probability  of  detection  error  after  equalization)  with  Algorithm  III  being 
a  close  second. 

Future work includes performance analysis, adaptive implementation and extension to MIMO
scenarios involving more than one information signal.
FIGURE  CAPTIONS

Fig.  1.  Example  1:  Normalized  MSE  after  equalization  for  various  record  lengths  (T)  and  SNR’s, 
averaged  over  100  Monte  Carlo  runs. 

Fig.  2.  Example  1:  Probability  of  error  after  equalization  for  various  record  lengths  (T)  and  SNR’s, 
averaged  over  100  Monte  Carlo  runs. 

Fig.  3.  Example  2:  Normalized  MSE  after  equalization  for  various  record  lengths  (T)  and  SNR’s, 
averaged  over  100  Monte  Carlo  runs. 

Fig.  4.  Example  2:  Probability  of  error  after  equalization  for  various  record  lengths  (T)  and  SNR’s, 
averaged  over  100  Monte  Carlo  runs. 
[Figure panels omitted. Panel titles: "EXAMPLE 1: IIR channel, 4-QAM signal" and "EXAMPLE 2: channel with common zero, 4-QAM signal".]
Abstract

The problem of blind equalization of SIMO (single-input multiple-output) communications channels
is considered using only the second-order statistics of the data. Such models arise when data from
a single receiver are fractionally sampled (assuming that there is excess bandwidth), or when an
antenna array is used with or without fractional sampling. We extend the multistep linear prediction
approach to infinite impulse response (IIR) channels as well as to the case where the subchannel
transfer functions have common zeros. In the past this approach has been confined to finite impulse
response (FIR) channels with no common subchannel zeros. We focus on direct design of finite-length
MMSE (minimum mean-square error) blind equalizers. Knowledge of the nature of the underlying
model (FIR or IIR) or the model order is not required. Our approach works when the “subchannel”
transfer functions have common zeros so long as the common zeros are minimum-phase zeros.
Illustrative simulation examples are provided.
1 Introduction


Consider a discrete-time SIMO (single-input multiple-output) system with $N$ outputs and one input.
The $i$-th component of the output at time $k$ is given by

$y_i(k) = \mathcal{F}_i(z) w(k) + n_i(k), \quad i = 1, 2, \ldots, N,$  (1-1)

$y(k) = \mathcal{F}(z) w(k) + n(k) = s(k) + n(k),$  (1-2)

where $y(k) = [y_1(k) : y_2(k) : \cdots : y_N(k)]^T$, similarly for $s(k)$ and $n(k)$, and $z$ is the
$\mathcal{Z}$-transform variable as well as the backward-shift operator (i.e., $z^{-1} w(k) = w(k-1)$,
etc.). The sequence $w(k)$ is the (single) input at sampling time $k$, $y_i(k)$ is the $i$-th noisy
output, $s_i(k)$ is the $i$-th noise-free output, $n_i(k)$ is the additive measurement noise, and

$\mathcal{F}_i(z) := \sum_{l=0}^{\infty} f_i(l) z^{-l}$  (1-3)

is the scalar transfer function with $w(k)$ as the input and $y_i(k)$ as the output; it represents the
$i$-th subchannel. We allow all of the above variables to be complex-valued. The overall transfer
function is denoted by the $N \times 1$ vector $\mathcal{F}(z)$ with its $i$-th element as
$\mathcal{F}_i(z)$. We have

$\mathcal{F}(z) = \sum_{i=0}^{\infty} F_i z^{-i}.$  (1-4)
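As an illustration of the model (1-1)-(1-4), the following is a minimal sketch that generates the outputs of an FIR SIMO channel from its impulse response. The function name `simo_output`, the array layout (row `l` of `F` holds the vector $F_l$) and the example taps are assumptions made here for illustration; they do not come from the paper.

```python
import numpy as np

def simo_output(F, w, noise_std=0.0, rng=None):
    """Return y(k) = sum_l F_l w(k-l) + n(k) for an FIR SIMO channel.

    F has shape (n_taps, N): row l holds the N-vector F_l of (1-4).
    w is the scalar input sequence; inputs before k = 0 are taken as zero.
    """
    F = np.asarray(F, dtype=complex)
    w = np.asarray(w, dtype=complex)
    n_taps, N = F.shape
    s = np.zeros((len(w), N), dtype=complex)
    for l in range(n_taps):
        if l >= len(w):
            break
        # z^{-l} w(k) = w(k - l): delay the input by l samples
        s[l:] += np.outer(w[:len(w) - l], F[l])
    if noise_std > 0.0:
        rng = np.random.default_rng() if rng is None else rng
        noise = rng.standard_normal(s.shape) + 1j * rng.standard_normal(s.shape)
        s = s + (noise_std / np.sqrt(2.0)) * noise  # circular complex white noise
    return s

# Hypothetical two-subchannel example (N = 2, two taps)
F_example = [[1.0, 0.5], [0.4, -1.0]]
y = simo_output(F_example, [1.0, -1.0, 1.0])
```

With `noise_std=0` the function returns the noise-free output $s(k)$ of (1-2).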

Such models arise in several useful baseband-equivalent digital communications and other
applications. A case of some interest is that of fractionally-spaced samples of a single baseband
received signal leading to a SIMO model [1],[4],[8]. Alternatively, a similar model can be derived
when we have a single signal impinging upon an antenna array with $N$ elements [5]. A similar model
arises if we have an antenna array coupled with fractional sampling at each array-element [5].

In these applications one of the objectives is to recover the inputs $w(k)$ given the noisy
measurements but not given the knowledge of the system transfer function. Recently there has been
much interest in solving this problem using only (or at least, to the maximum extent possible)
the second-order statistics (SOS) of the data (see [1], [3]-[5], [8]-[14] and references therein).
The solution is closely tied to the existence of an FIR (finite impulse response) inverse to the
system transfer function [1], [3]-[5], [8]-[14]. An overwhelming number of papers (see
[4],[5],[9]-[12] and references therein) have concentrated on a two-step procedure: first estimate
the channel impulse response (IR) and then design an equalizer using the estimated channel. A
fundamental restriction in these works is that the channel is FIR with no common zeros among the
various subchannels. A few (see [1] and [13], e.g.) have proposed direct design of the equalizer
bypassing channel estimation. Still they assume FIR channels with no common zeros.

In this paper we allow IIR (infinite impulse response) channels (which are finitely parametrized).
We will also allow common zeros so long as they are minimum-phase (i.e., they lie inside the unit
circle). Finally, in the presence of nonminimum-phase common zeros, our proposed approach
equalizes the spectrally-equivalent minimum-phase counterpart of $\mathcal{F}(z)$; it does not
"fall apart" unlike quite a few existing approaches. Our proposed approach is inspired by that of
[10] and [12] which have been derived and analyzed therein only for FIR channels with no common
zeros. The basis for the proposed approach is multistep linear prediction. A one-step linear
prediction-based approach was first proposed in [8] and later expanded upon in [9] and [14]. Unlike
the subspace-based methods of [4], [5], [11] and others (see also [3] and references therein), the
linear prediction (LP) based approach of [8], [9] and [14] turns out to be rather insensitive to the
order of the underlying FIR channel (so long as one overfits). More recently, it has been pointed
out in [10] and [12] that the LP-based approach can be further significantly improved by utilizing
some additional information not exploited by LP. Although [10] and [12] derive their algorithms in
quite a different manner, their final algorithms are essentially the same. In this paper we will
follow the approach of [12] which is based upon multistep linear prediction. As noted earlier,
unlike [12] we allow IIR channels and common zeros.

Two approaches are discussed in this paper for designing a blind MMSE (minimum mean-square
error) linear equalizer of a specified length and delay. The approaches do not require the
knowledge of the underlying system model orders or IR length, nor do they require the knowledge of
the nature of the model (FIR or IIR). Algorithm I is novel and is inspired by [10] and [12] whereas
Algorithm II is a straightforward extension of [9] and [14], and it was first proposed in [18]. Note
that the prediction error methods of [8], [9] and [14] apply to the problem under consideration
with some straightforward extensions/modifications (as we discuss in Sec. 3.4.2). Interestingly,
[8], [9] and [14] derive their results under the assumption of FIR channels with no common zeros.
Although our emphasis is on MMSE equalization, estimation of a leading part of the underlying
channel IR is an essential part of this paper. For MMSE equalization with a given delay $d$, it is
sufficient to estimate the channel IR at the first $d + 1$ samples, which is what is done in this
paper. Clearly, if channel IR estimation is the objective, then one can pick a 'large' value of $d$.

The paper is organized as follows. Precise model assumptions and some background results used
later in the paper are stated and developed in Sec. 2. IIR channels with no common subchannel
zeros are considered in Sec. 3 where the proposed algorithm of this paper is developed. Under
the assumptions of Sec. 3, finite-length inverses and finite-length multistep linear predictors
exist. In Sec. 4 we allow common subchannel zeros. Here ideally we need infinite-length inverses
and multistep linear predictors. Three computer simulation examples involving a 4-QAM signal are
presented in Sec. 5 to illustrate the performance of the proposed approach and compare it with
that of the linear prediction approach.

2  Model  Assumptions  and  Preliminaries:  No  Common  Zeros 

In  this  section  we  consider  precise  model  assumptions  and  some  background  results  used  later  in 
the  paper.  In  Secs.  2  and  3  we  focus  on  systems  with  no  common  subchannel  zeros.  The  case  of 
common  zeros  is  discussed  in  Sec.  4. 

Assume the following:

(H1) $\mathcal{F}(z) = A^{-1}(z) \mathcal{B}(z)$ where $A(z) = 1 + \sum_{i=1}^{n_a} a_i z^{-i}$ is
$1 \times 1$, $\mathcal{B}(z) = \sum_{i=0}^{n_b} B_i z^{-i}$ is $N \times 1$ and $N > 1$.

(H2) Rank$\{\mathcal{B}(z)\} = 1$ $\forall z$ including $z = \infty$ but excluding $z = 0$, i.e.,
$\mathcal{B}(z)$ is irreducible [7, Sec. 6.3].

(H3) $A(z) \neq 0$ for $|z| \geq 1$.

(H4) $\{w(k)\}$ is zero-mean, white. Take $E\{|w(k)|^2\} = 1$ by absorbing any non-identity
correlation of $w(k)$ into $\mathcal{F}(z)$.

(H5) $\{n(k)\}$ is zero-mean with $E\{n(k+\tau) n^H(k)\} = \sigma_n^2 I_{N \times N} \delta(\tau)$
where $I_{N \times N}$ is the $N \times N$ identity matrix, $\delta(\tau)$ is the Kronecker delta
function, and the superscript $H$ is the Hermitian operator (complex conjugate transpose).
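The rational model in (H1) and the stability condition (H3) can be made concrete with a short sketch. It computes the leading impulse response matrices $F_l$ of $\mathcal{F}(z) = A^{-1}(z)\mathcal{B}(z)$ by the recursion $F_l = B_l - \sum_i a_i F_{l-i}$ and checks (H3) through the roots of $A(z)$. The function names and the numerical values are hypothetical and chosen only for illustration.

```python
import numpy as np

def leading_impulse_response(a, B, n_terms):
    """First n_terms vectors F_l of F(z) = B(z) / A(z).

    a = [a_1, ..., a_na] are the coefficients of A(z) = 1 + sum_i a_i z^{-i};
    B has shape (nb + 1, N) with row i equal to B_i.
    """
    a = np.asarray(a, dtype=complex)
    B = np.asarray(B, dtype=complex)
    F = np.zeros((n_terms, B.shape[1]), dtype=complex)
    for l in range(n_terms):
        if l < B.shape[0]:
            F[l] = B[l]
        # A(z) F(z) = B(z)  =>  F_l = B_l - sum_{i>=1} a_i F_{l-i}
        for i in range(1, min(l, len(a)) + 1):
            F[l] -= a[i - 1] * F[l - i]
    return F

def is_stable(a):
    """Check (H3): all roots of z^{na} A(z) lie strictly inside the unit circle."""
    poles = np.roots(np.concatenate(([1.0], np.asarray(a, dtype=complex))))
    return bool(np.all(np.abs(poles) < 1.0))

# Hypothetical example: A(z) = 1 - 0.5 z^{-1}, B(z) = [1, 2]^T, so F_l = 0.5^l [1, 2]^T
F = leading_impulse_response([-0.5], [[1.0, 2.0]], 4)
```

For an FIR channel one may pass an empty list for `a`, in which case `F` simply reproduces the rows of `B` followed by zeros.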

Assumption (H2) is equivalent to stating that the various subchannels $\mathcal{F}_i(z)$ have no
common zeros. It has been shown in [6] (using some results from [2]) that under (H1)-(H3) there
exists a finite degree left-inverse (not necessarily unique) of $\mathcal{F}(z)$:

$\mathcal{G}(z) \mathcal{F}(z) = 1$  (2-1)

where $\mathcal{G}(z)$ is $1 \times N$ given by

$\mathcal{G}(z) = \sum_{i=0}^{L_e} G_i z^{-i}$ for any $L_e \geq n_a + n_b - 1$.  (2-2)

Remark 1: The left-inverse $\mathcal{G}(z)$ of $\mathcal{F}(z)$ consists of two parts:
$\mathcal{G}(z) = \mathcal{G}_B(z) A(z)$ where $\mathcal{G}_B(z) \mathcal{B}(z) = 1$ so that
$\mathcal{G}(z) \mathcal{F}(z) = \mathcal{G}_B(z) A(z) A^{-1}(z) \mathcal{B}(z) = \mathcal{G}_B(z) \mathcal{B}(z) = 1$.
Finite-length left-inverses of FIR SIMO channels have been the subject of intense research
activity [4]-[6],[8]-[13]. Left-inverses to MIMO IIR/FIR channels have been considered in [6]. It
appears that the results of [6] pertaining to MIMO models are the sharpest to date. Finally, it is
important to stress that, unlike this contribution, [4], [5] and [8]-[13] do not allow IIR
channels, or subchannels having common zeros, in their problem formulation. □


2.1  MMSE  Equalizer  with  Delay  d 


We wish to design an MMSE (minimum mean-square error) linear equalizer of a specified length.
It is not too hard to establish (using the orthogonality principle [16], for example) that the MMSE
equalizer of length $L_e + 1$ to estimate $w(k-d)$ ($d \geq 0$) based upon $y(n)$,
$n = k, k-1, \ldots, k - L_e$, satisfies

$[\, G_{d,0} \;\; G_{d,1} \;\; \cdots \;\; G_{d,L_e} \,] \, \mathcal{R}_{yyL_e} = [\, F_d^H \;\; F_{d-1}^H \;\; \cdots \;\; F_0^H \;\; 0 \;\; \cdots \;\; 0 \,]$  (2-3)

where $\mathcal{R}_{yyL_e}$ is a $[N(L_e+1)] \times [N(L_e+1)]$ matrix with its $ij$-th block-element
given by $R_{yy}(j-i) := E\{y(k+j-i) y^H(k)\}$. The equalized output is given by

$\hat{w}(k-d) = \sum_{i=0}^{L_e} G_{d,i} \, y(k-i).$  (2-4)
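To make (2-3) and (2-4) concrete, the sketch below builds $\mathcal{R}_{yyL_e}$ from a known FIR channel (unit-variance white input, white noise of variance `noise_var`), solves (2-3) for the equalizer taps, and forms the combined channel-equalizer response. It assumes the true channel is available, which is precisely what the blind algorithms of the paper avoid; all function names and numerical values are hypothetical.

```python
import numpy as np

def corr(F, m):
    """R_ss(m) = E{s(k+m) s^H(k)} for a unit-variance white input; F has shape (n_taps, N)."""
    if m < 0:
        return corr(F, -m).conj().T
    N = F.shape[1]
    R = np.zeros((N, N), dtype=complex)
    for l in range(F.shape[0] - m):
        R += np.outer(F[l + m], F[l].conj())
    return R

def mmse_equalizer(F, Le, d, noise_var):
    """Solve (2-3) for [G_{d,0} ... G_{d,Le}]; returns an array of shape (Le + 1, N)."""
    F = np.asarray(F, dtype=complex)
    n_taps, N = F.shape
    R = np.zeros((N * (Le + 1), N * (Le + 1)), dtype=complex)
    rhs = np.zeros(N * (Le + 1), dtype=complex)
    for i in range(Le + 1):
        for j in range(Le + 1):
            # (i, j)-th block is R_yy(j - i) = R_ss(j - i) + noise_var * I * delta(j - i)
            R[i*N:(i+1)*N, j*N:(j+1)*N] = corr(F, j - i) + (i == j) * noise_var * np.eye(N)
        if 0 <= d - i < n_taps:
            rhs[i*N:(i+1)*N] = F[d - i].conj()   # F_{d-i}^H written as a row
    # G R = rhs  <=>  R^T G^T = rhs^T
    G = np.linalg.solve(R.T, rhs)
    return G.reshape(Le + 1, N)

def combined_response(F, G):
    """Taps of the cascade sum_i G_{d,i} F_{j-i}, j = 0 .. Le + n_taps - 1."""
    F = np.asarray(F, dtype=complex)
    c = np.zeros(G.shape[0] + F.shape[0] - 1, dtype=complex)
    for i in range(G.shape[0]):
        for l in range(F.shape[0]):
            c[i + l] += G[i] @ F[l]
    return c

# Hypothetical channel with no common subchannel zeros, Le = 1, delay d = 1
G = mmse_equalizer([[1.0, 0.5], [0.4, -1.0]], Le=1, d=1, noise_var=1e-6)
c = combined_response([[1.0, 0.5], [0.4, -1.0]], G)
```

At very low noise the combined response `c` is close to a unit pulse at lag `d`, i.e. the MMSE equalizer approaches the zero-forcing left-inverse of (2-1).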
Clearly one can obtain a consistent estimate of $\mathcal{R}_{yyL_e}$ from the given data. It remains
to estimate the $F_i$'s to complete the design. This is where the multistep predictor approach turns
out to be useful.


3  Partial  Channel  Identification  Using  Multistep  Predictors 


3.1  FIR  Multistep  Linear  Predictors 


By (1-2) and (H1), it follows that

$s(k) = -\sum_{i=1}^{n_a} a_i s(k-i) + \sum_{i=0}^{n_b} B_i w(k-i).$  (3-1)

It then follows from (3-1) that

$s(k) = -\sum_{i=2}^{n_a} a_i s(k-i) - a_1 \left[ -\sum_{i=1}^{n_a} a_i s(k-1-i) + \sum_{i=0}^{n_b} B_i w(k-1-i) \right] + \sum_{i=0}^{n_b} B_i w(k-i)$

$\quad\;\; = -\sum_{i=2}^{n_a+1} a_i^{(2)} s(k-i) + \sum_{i=0}^{n_b+1} B_i^{(2)} w(k-i)$  (3-2)

for some appropriate choices of the parameters $a_i^{(2)}$'s and $B_i^{(2)}$'s. Now substitute for
$s(k-2)$ using (3-1) in (3-2), and continuing this way, we have, in general, for appropriate choices
of $a_i^{(l)}$'s and $B_i^{(l)}$'s ($l \geq 1$)

$s(k) = -\sum_{i=l}^{n_a+l-1} a_i^{(l)} s(k-i) + \sum_{i=0}^{n_b+l-1} B_i^{(l)} w(k-i).$  (3-3)

Both (3-1) and (3-3) represent the same signal/system and therefore, they must have the same
impulse response. By (1-4), (H1), (3-1) and (3-3), it then follows that

$B_i^{(l)} = F_i$ for $0 \leq i \leq l-1$.  (3-4)

Let us rewrite (3-3) as

$s(k) = e(k|k-l) + \hat{s}(k|k-l)$  (3-5)

where

$e(k|k-l) := \sum_{i=0}^{l-1} B_i^{(l)} w(k-i) = \sum_{i=0}^{l-1} F_i w(k-i)$  (3-6)

and

$\hat{s}(k|k-l) := -\sum_{i=l}^{n_a+l-1} a_i^{(l)} s(k-i) + \sum_{i=l}^{n_b+l-1} B_i^{(l)} w(k-i).$  (3-7)

We first need some notations and definitions.

Notations and Definitions: Consider the Hilbert space $H$ of square integrable complex random
variables on a common probability space endowed with the inner product (for scalar complex
random variables $x_1$ and $x_2$) $\langle x_1, x_2 \rangle = E\{x_1 x_2^*\}$ where the superscript
$*$ denotes complex conjugation (see [15]). Let $Sp\{x_i, \, i \in I\}$ denote the subspace of $H$
generated by the random variables/vectors in the set $\{x_i, \, i \in I\}$. Let $H_k(s)$ denote the
subspace generated by the past of $s$ up to time $k$

$H_k(s) := Sp\{s_i(k-m), \; i = 1, 2, \ldots, N; \; m = 0, 1, 2, \ldots\}$  (3-8)

and let $H_{k-l,L}(s)$ denote the subspace spanned by a finite past of $s$

$H_{k-l,L}(s) := Sp\{s_i(k-m), \; i = 1, 2, \ldots, N; \; m = l, l+1, \ldots, L\}.$  (3-9)

Let $(s(k)|H_{k-l}(s))$ denote the orthogonal projection of $s(k)$ onto the subspace $H_{k-l}(s)$
[15]. □

Theorem 1. Under (H1)-(H4) and for $l = 1, 2, \ldots$, $\{s(k)\}$ can be decomposed as in (3-5) such
that

$E\{e(k|k-l) s^H(k-m)\} = 0 \quad \forall m \geq l,$  (3-10)

$\hat{s}(k|k-l) = (s(k)|H_{k-l}(s)),$  (3-11)

$\hat{s}(k|k-l) \in H_{k-l,\, n_a+n_b+l-1}(s)$  (3-12)

and

$\hat{s}(k|k-l) = (s(k)|H_{k-l,\, n_a+n_b+l-1}(s)).$  (3-13)

The decomposition (3-5) is unique. •

Proof: By (1-2), (1-4), (H1) and (H3), we have

$s(k) = \sum_{i=0}^{\infty} F_i w(k-i).$  (3-14)

By (2-1), (2-2) and (1-2), it follows that

$\sum_{i=0}^{L_e} G_i s(k-i) = w(k).$  (3-15)

Substituting for $w(k)$ from (3-15) in (3-7), it follows that

$\hat{s}(k|k-l) \in H_{k-l}(s).$  (3-16)

By (3-14) and (H4), we have

$E\{w(k) s^H(k-m)\} = 0 \quad \forall m > 0.$  (3-17)

Therefore, using (3-6) and (3-17), it follows that (3-10) is true. By (3-5), (3-10), (3-16) and
the orthogonal projection theorem [15], it follows that (3-11) is true (as the "error" $e(k|k-l)$ is
orthogonal to the data $s(k-m)$ ($m \geq l$), hence to the subspace $H_{k-l}(s)$).

It remains to establish (3-12) and (3-13). Define

$x(k) := A(z) s(k) = \mathcal{B}(z) w(k).$  (3-18)

By the proof of Theorem 1 in [14] (and with obvious changes in notation of [14]), using (3-18) (i.e.
$x(k) = \mathcal{B}(z) w(k)$) and (H2), we have

$H_{k,M}(x) = H_{k,M+n_b}(w) \quad \forall M \geq n_b - 1.$  (3-19)

It follows from (3-18) (i.e. $x(k) = A(z) s(k)$) and (H1) that

$H_{k,M}(x) \subseteq H_{k,M+n_a}(s) \quad \forall M \geq 0.$  (3-20)

Therefore, (3-19) and (3-20) lead to

$H_{k,M+n_b}(w) \subseteq H_{k,M+n_a}(s) \quad \forall M \geq n_b - 1,$  (3-21)

and in general, we have for any integer $l$

$H_{k-l,M+n_b+l}(w) \subseteq H_{k-l,M+n_a+l}(s) \quad \forall M \geq n_b - 1.$  (3-22)

It therefore follows from (3-7) and (3-22) that

$\hat{s}(k|k-l) \in H_{k-l,\, n_a+M+l}(s) \quad \forall M \geq n_b - 1.$  (3-23)

If we pick $M = n_b - 1$ in (3-23), we obtain (3-12). Finally, (3-13) follows from (3-5), (3-10),
(3-12) and the orthogonal projection theorem [15].

Uniqueness of the decomposition (3-5) is a consequence of the orthogonal projection theorem
[15]. Suppose that there exists some other decomposition

$s(k) = \tilde{e}(k|k-l) + \tilde{s}(k|k-l)$  (3-24)

such that

$E\{\tilde{e}(k|k-l) s^H(k-m)\} = 0 \quad \forall m \geq l$  (3-25)

and

$\tilde{s}(k|k-l) \in H_{k-l}(s).$  (3-26)

Then the orthogonal projection theorem [15] implies that

$E\{\|\hat{s}(k|k-l) - \tilde{s}(k|k-l)\|^2\} = 0$  (3-27)

and

$E\{\|e(k|k-l) - \tilde{e}(k|k-l)\|^2\} = 0.$  (3-28)

This completes the proof of Theorem 1. □

Remark 2: When $A(z) = 1$ (i.e. the channel is FIR), the results of Theorem 1 hold true with
$n_a = 0$. Note that we may write $A(z) = \sum_{i=0}^{n_a} a_i z^{-i}$ with $a_0 = 1$. □

It follows from Theorem 1 that

$\hat{s}(k|k-l) = \sum_{i=l}^{L_l} A_i^{(l)} s(k-i)$ where $L_l \geq n_a + n_b + l - 1$,  (3-29)

for some $N \times N$ matrices $A_i^{(l)}$'s. By (3-5) and (3-10) (recall also the orthogonal
projection theorem), we have

$\hat{s}(k|k-l) = \arg \min_{x(k) \in H_{k-l}(s)} E\{\|s(k) - x(k)\|^2\}.$  (3-30)

Therefore, $\hat{s}(k|k-l)$ is the $l$-step (ahead) linear predictor of $s(k)$ given
$\{s(m), \; m \leq k-l\}$. By (3-13) it is also the $l$-step (ahead) linear predictor of $s(k)$ given
$\{s(m), \; k - L_l \leq m \leq k-l\}$.
Using (3-5) and (3-29) we have

$s(k) = \sum_{i=l}^{L_l} A_i^{(l)} s(k-i) + e(k|k-l).$  (3-31)

By (3-10) and (3-31), for $m \geq l$,

$E\{s(k) s^H(k-m)\} = \sum_{i=l}^{L_l} A_i^{(l)} E\{s(k-i) s^H(k-m)\}.$  (3-32)

By the orthogonal projection theorem and (3-13), it is sufficient to consider (3-32) for
$m = l, l+1, \ldots, L_l$ in order to solve for the $A_i^{(l)}$'s. Using these values of $m$ in (3-32)
we may write

$[\, A_l^{(l)} \;\; A_{l+1}^{(l)} \;\; \cdots \;\; A_{L_l}^{(l)} \,] \, \mathcal{R}_{ss(L_l-l)} = [\, R_{ss}(l) \;\; R_{ss}(l+1) \;\; \cdots \;\; R_{ss}(L_l) \,]$  (3-33)

where $\mathcal{R}_{ssM}$ denotes a $[N(M+1)] \times [N(M+1)]$ matrix with its $ij$-th block element as
$R_{ss}(j-i) = E\{s(k+j-i) s^H(k)\}$. Note that $\mathcal{R}_{ss(L_l-l)}$ is not necessarily full
rank, therefore, the coefficients $A_i^{(l)}$'s are not necessarily unique. (Note that the orthogonal
projection theorem implies uniqueness in the sense of (3-27); it does not necessarily imply the
uniqueness of the coefficients in (3-31).) A minimum norm solution to (3-33) may be obtained as [17]

$[\, A_l^{(l)} \;\; A_{l+1}^{(l)} \;\; \cdots \;\; A_{L_l}^{(l)} \,] = [\, R_{ss}(l) \;\; R_{ss}(l+1) \;\; \cdots \;\; R_{ss}(L_l) \,] \, \mathcal{R}_{ss(L_l-l)}^{\#}$  (3-34)

where the superscript $\#$ denotes the pseudoinverse.
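A numerical check of Theorem 1 may help here. The sketch below computes the minimum-norm multistep predictor of (3-34) from the exact correlations of a known FIR channel and then evaluates the covariance of the prediction error, which by (3-6) and (H4) should equal $\sum_{i=0}^{l-1} F_i F_i^H$. The helper names, the pseudoinverse cutoff `1e-10`, and the example channel are assumptions for illustration only.

```python
import numpy as np

def corr(F, m):
    """R_ss(m) = E{s(k+m) s^H(k)} for a unit-variance white input; F has shape (n_taps, N)."""
    if m < 0:
        return corr(F, -m).conj().T
    N = F.shape[1]
    R = np.zeros((N, N), dtype=complex)
    for l in range(F.shape[0] - m):
        R += np.outer(F[l + m], F[l].conj())
    return R

def multistep_predictor(F, l, L):
    """Minimum-norm solution of (3-33): [A_l ... A_{L+l}] as an N x N(L+1) array."""
    R = np.block([[corr(F, j - i) for j in range(L + 1)] for i in range(L + 1)])
    rhs = np.hstack([corr(F, m) for m in range(l, L + l + 1)])
    # R may be rank deficient, hence the pseudoinverse with an explicit cutoff
    return rhs @ np.linalg.pinv(R, 1e-10)

def prediction_error_cov(F, l, L):
    """E{e(k|k-l) e^H(k|k-l)} = R_ss(0) - sum_i A_i R_ss(-i), i = l .. L + l."""
    N = F.shape[1]
    A = multistep_predictor(F, l, L)
    cov = corr(F, 0)
    for idx in range(L + 1):
        cov = cov - A[:, idx*N:(idx+1)*N] @ corr(F, -(l + idx))
    return cov

# Hypothetical FIR channel (n_a = 0, n_b = 1) with no common subchannel zeros
F = np.array([[1.0, 0.5], [0.4, -1.0]], dtype=complex)
one_step_cov = prediction_error_cov(F, l=1, L=2)   # expected to be close to F_0 F_0^H
```

The one-step error covariance has rank one, which is the property exploited for noise variance estimation and channel identification in the subsections that follow.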


3.2  Estimation  of  Noise  Variance 


Consider the case of $l = 1$ (one-step prediction). By (3-6) and (3-31) we have

$s(k) = \sum_{i=1}^{L_1} A_i^{(1)} s(k-i) + F_0 w(k).$  (3-35)

If $L_1 > n_a + n_b$, then $A_i^{(1)} = 0$ for $i > n_a + n_b$ by virtue of (3-13).

Lemma 1. Under (H1)-(H4), $\rho(\mathcal{R}_{ssL_1}) \leq N L_1 + 1$ for $L_1 \geq n_a + n_b$ where
$\rho(A)$ denotes the rank of $A$. •

Proof: It follows from (3-35) that

(3-36)

Clearly

(3-37)

(3-38)

Using (3-36)-(3-38) and Sylvester's inequality [7, p. 655], it follows that

$\rho(\mathcal{R}_{ssL_1}) + N - N(L_1 + 1) \leq 1$  (3-39)

which yields the desired result. □

In a fashion similar to $\mathcal{R}_{ssM}$ in (3-33), let $\mathcal{R}_{yyM}$ denote a
$[N(M+1)] \times [N(M+1)]$ matrix with its $ij$-th block element as
$R_{yy}(j-i) = E\{y(k+j-i) y^H(k)\}$; define similarly $\mathcal{R}_{nnM}$ pertaining to the additive
noise. Carry out an eigendecomposition of $\mathcal{R}_{yyL_1}$. Then the smallest $N - 1$
eigenvalues of $\mathcal{R}_{yyL_1}$ equal $\sigma_n^2$ because under (H1)-(H4),
$\rho(\mathcal{R}_{ssL_1}) \leq N L_1 + 1$ whereas under (H5),
$\rho(\mathcal{R}_{nnL_1}) = N L_1 + N = \rho(\mathcal{R}_{yyL_1})$. Thus a consistent estimate
$\hat{\sigma}_n^2$ of $\sigma_n^2$ is obtained by taking it as the average of the smallest $N - 1$
eigenvalues of $\hat{\mathcal{R}}_{yyL_1}$, the data-based consistent estimate of
$\mathcal{R}_{yyL_1}$.

We will need the estimate of noise variance later to calculate $\mathcal{R}_{ss(L_l-l)}^{\#}$ in
(3-34) for various $l \geq 1$. By (3-29), $L_l - l \geq n_a + n_b - 1$, independent of $l$. This
suggests that we keep

$L_l - l = L \geq n_a + n_b - 1 \quad (\forall l).$  (3-40)

Then under (H4) and (H5),

$\mathcal{R}_{ssL} = \mathcal{R}_{yyL} - \mathcal{R}_{nnL} = \mathcal{R}_{yyL} - \sigma_n^2 I_{N(L+1) \times N(L+1)}.$  (3-41)

Thus, $\mathcal{R}_{ssL}$ can be estimated from noisy data.
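The eigenvalue argument of this subsection can be sketched in a few lines. The function below averages the smallest $N - 1$ eigenvalues of a Hermitian correlation matrix; in the usage example a random rank-deficient matrix merely stands in for $\mathcal{R}_{ssL_1}$, so the numbers are illustrative assumptions and not the paper's simulation setup.

```python
import numpy as np

def noise_variance_from_eigs(R_yy, N):
    """Average of the smallest N - 1 eigenvalues of the Hermitian matrix R_yy."""
    eigvals = np.linalg.eigvalsh(R_yy)      # returned in ascending order
    return float(np.mean(eigvals[:N - 1]))

# Stand-in example: rank-4 "signal" part of size 6 x 6 plus white noise of variance 0.3
rng = np.random.default_rng(0)
T = rng.standard_normal((6, 4))
R_yy = T @ T.T + 0.3 * np.eye(6)
sigma2_hat = noise_variance_from_eigs(R_yy, N=2)
```

Subtracting `sigma2_hat` times the identity from `R_yy` then gives the signal correlation matrix as in (3-41).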

3.3  Partial  Channel  Identification 

Recall that our main objective is to implement the MMSE linear equalizer with delay $d$ as specified
by (2-3) and (2-4). To this end, we need estimates of $F_i$ for $i = 0, 1, \ldots, d$. We now show how
(3-6) and (3-31) may be used for this purpose.

Let

$\bar{L} := L + d + 1$  (3-42)

where $L$ is as in (3-40). Rewrite (3-31) as

$e(k|k-l) = \sum_{i=0}^{\bar{L}} \bar{A}_i^{(l)} s(k-i)$  (3-43)

where

$\bar{A}_i^{(l)} := \begin{cases} I_{N \times N} & \text{for } i = 0 \\ 0 & \text{for } 1 \leq i \leq l-1 \\ -A_i^{(l)} & \text{for } l \leq i \leq L + l \\ 0 & \text{for } L + l + 1 \leq i \leq \bar{L}. \end{cases}$  (3-44)

By (3-40) $L_l = L + l$, therefore, for each $l$, we estimate $L + 1$ coefficients in (3-34). For
$l \geq 2$, define

$e_l(k) := e(k|k-l) - e(k|k-l+1) = \sum_{i=0}^{\bar{L}} D_i^{(l)} s(k-i)$  (3-45)

where

$D_i^{(l)} := \bar{A}_i^{(l)} - \bar{A}_i^{(l-1)}, \quad i = 0, 1, \ldots, \bar{L}.$  (3-46)

By (3-44), $D_0^{(l)} = 0$ $\forall l \geq 2$.

Consider the $[N(d+1)]$-vector

$E(k) := [\, e_{d+1}^T(k+d) : e_d^T(k+d-1) : \cdots : e_2^T(k+1) : e^T(k|k-1) \,]^T.$  (3-47)

Using (3-43)-(3-47) we have

$E(k) = \mathcal{D} \, \mathcal{S}(k)$  (3-48)

where

$\mathcal{S}(k) := [\, s^T(k+d-1) : s^T(k+d-2) : \cdots : s^T(k-\bar{L}) \,]^T$  (3-49)

is a $[N(\bar{L}+d)]$-column vector and

$\mathcal{D} := \begin{bmatrix}
D_1^{(d+1)} & \cdots & D_{\bar{L}}^{(d+1)} & 0 & \cdots & 0 & 0 \\
0 & D_1^{(d)} & \cdots & D_{\bar{L}}^{(d)} & 0 & \cdots & 0 \\
\vdots & \ddots & \ddots & & \ddots & \ddots & \vdots \\
0 & \cdots & 0 & D_1^{(2)} & \cdots & D_{\bar{L}}^{(2)} & 0 \\
0 & \cdots & 0 & \bar{A}_0^{(1)} & \cdots & \bar{A}_{\bar{L}-1}^{(1)} & \bar{A}_{\bar{L}}^{(1)}
\end{bmatrix}$  (3-50)

is a $[N(d+1)] \times [N(\bar{L}+d)]$ matrix. In (3-50) we have used the fact that $D_0^{(l)} = 0$
$\forall l \geq 2$. Using (3-6), (3-45) and (3-47), we have

$E(k) = [\, F_d^T : F_{d-1}^T : \cdots : F_0^T \,]^T w(k) =: F w(k).$  (3-51)

By (3-48), (3-51) and (H4), it follows that

$R_{EE}(0) = E\{E(k) E^H(k)\} = F F^H$  (3-52)

$\qquad\quad\;\; = \mathcal{D} \, \mathcal{R}_{ss(\bar{L}+d-1)} \, \mathcal{D}^H.$  (3-53)

Clearly $\rho(F F^H) = 1$. This suggests a method to estimate $F$. Calculate $R_{EE}(0)$ as

$R_{EE}(0) = \mathcal{D} \, \mathcal{R}_{ss(\bar{L}+d-1)} \, \mathcal{D}^H.$  (3-54)

Carry out an eigendecomposition of $R_{EE}(0)$. The nonzero eigenvalue of $R_{EE}(0)$ is equal to
$\lambda = \|F\|^2$. Let the corresponding unit-norm eigenvector be denoted by $Q_\lambda$. Then

$F = \alpha \sqrt{\lambda} \, Q_\lambda$  (3-55)

for some $\alpha$ such that $|\alpha| = 1$. (We have the equality $F F^H = \lambda Q_\lambda Q_\lambda^H$
but not necessarily (3-55) with $\alpha = 1$.) In practice when the true values in (3-54) are
replaced with their data-based (consistent) estimates, $\rho(\hat{R}_{EE}(0)) > 1$ where
$\hat{R}_{EE}(0)$ denotes the estimate of $R_{EE}(0)$. In this case we pick the largest eigenvalue as
$\lambda$ and the corresponding unit-norm eigenvector as $Q_\lambda$ in order to implement
(3-55). More details are provided in Sec. 3.4.
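The last step, recovering $F$ from the rank-one matrix $F F^H$ up to a unit-modulus factor as in (3-55), can be sketched as follows. The function name and the test vector are hypothetical; the eigendecomposition is NumPy's standard Hermitian routine.

```python
import numpy as np

def rank_one_factor(R_EE):
    """Return sqrt(lambda) * Q_lambda for the largest eigenpair of a Hermitian matrix."""
    eigvals, eigvecs = np.linalg.eigh(R_EE)   # eigenvalues in ascending order
    lam = max(eigvals[-1], 0.0)               # guard against tiny negative round-off
    return np.sqrt(lam) * eigvecs[:, -1]

# Hypothetical stacked channel vector F and its rank-one correlation F F^H
F = np.array([1.0 + 1.0j, 0.5, -2.0j])
F_hat = rank_one_factor(np.outer(F, F.conj()))
alpha = np.vdot(F_hat, F) / np.vdot(F, F)     # unit-modulus ambiguity of (3-55)
```

Here `F_hat` reproduces $F F^H$ exactly but generally differs from `F` by the factor `alpha` with `abs(alpha) == 1`, which second-order statistics alone cannot resolve.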
3.4 Practical Implementation


Given data $y(k)$, $k = 1, 2, \ldots, T$. Pick the length $L_e + 1$ and delay $d$ of the MMSE
equalizer. (By (2-1) and (2-2) $L_e$ should satisfy $L_e \geq n_a + n_b - 1$.) Let $L = L_e$ in
(3-40). The following steps are executed to implement a practical algorithm based on the earlier
discussion in Secs. 2 and 3.1-3.3.


3.4.1  ALGORITHM  I:  Multistep  Linear  Predictors-Based  Blind  equalizer  -  MSLP 

I.1 Estimate the correlation function of the measurements at lag $m$ as

$\hat{R}_{yy}(m) = \frac{1}{T} \sum_{k=1}^{T} y(k+m) y^H(k)$  (3-56)

where we take $y(k+m) = 0$ if $k + m < 1$ or $> T$. Define the $[N(L+1)] \times [N(L+1)]$ matrix
$\hat{\mathcal{R}}_{yyL}$ with its $ij$-th block element as $\hat{R}_{yy}(j-i)$. Carry out an
eigendecomposition of $\hat{\mathcal{R}}_{yyL}$. Let $\lambda_i$ ($i = NL+2, \ldots, NL+N$) denote the
smallest $N - 1$ eigenvalues of $\hat{\mathcal{R}}_{yyL}$. Estimate the noise variance as

$\hat{\sigma}_n^2 = \frac{1}{N-1} \sum_{i=NL+2}^{NL+N} \lambda_i.$  (3-57)

The signal correlation function at lag $m$ is then estimated as

$\hat{R}_{ss}(m) = \hat{R}_{yy}(m) - \hat{\sigma}_n^2 I_{N \times N} \delta(m)$  (3-58)

where $\delta(m)$ is the Kronecker delta function. Define the $[N(L+1)] \times [N(L+1)]$ signal
correlation matrix estimate $\hat{\mathcal{R}}_{ssL}$ with its $ij$-th block element as
$\hat{R}_{ss}(j-i)$. We need $\hat{R}_{ss}(m)$ for $m = 0, 1, \ldots, L + 2d$ (see (3-42) and (3-54)).

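I.2 Now we implement (3-34). First we need to calculate $\hat{\mathcal{R}}_{ssL}^{\#}$. Carry out a
singular value decomposition of $\hat{\mathcal{R}}_{ssL}$ leading to
$\hat{\mathcal{R}}_{ssL} = U S V^H$ where $S = \mathrm{diag}\{s_i, \; i = 1, 2, \ldots, NL+N\}$. The
rank $n_1$ of $\hat{\mathcal{R}}_{ssL}$ is determined as the smallest $n$ for which

$\frac{\sum_{i=n+1}^{NL+N} s_i}{\sum_{i=1}^{NL+N} s_i} < \epsilon_1$  (3-59)

where $\epsilon_1 > 0$ is a small number. [For simulations presented in Sec. 5 we took
$\epsilon_1 = 0.01$.] The desired pseudoinverse is then calculated as

$\hat{\mathcal{R}}_{ssL}^{\#} = V_1 S_1^{-1} U_1^H$  (3-60)

where $S_1 = \mathrm{diag}\{s_i, \; i = 1, 2, \ldots, n_1\}$ and $U_1$ and $V_1$ are comprised of the
left and the right (respectively) singular vectors corresponding to the singular values retained
in $S_1$.

Estimate $A_i^{(l)}$ for $i = l, l+1, \ldots, L_l$ ($L_l = L + l$) via (3-34) for
$l = 1, 2, \ldots, d+1$:

$[\, \hat{A}_l^{(l)} \;\; \hat{A}_{l+1}^{(l)} \;\; \cdots \;\; \hat{A}_{L+l}^{(l)} \,] = [\, \hat{R}_{ss}(l) \;\; \hat{R}_{ss}(l+1) \;\; \cdots \;\; \hat{R}_{ss}(L+l) \,] \, \hat{\mathcal{R}}_{ssL}^{\#}.$  (3-61)

Following (3-44) set

$\hat{\bar{A}}_i^{(l)} := \begin{cases} I_{N \times N} & \text{for } i = 0 \\ 0 & \text{for } 1 \leq i \leq l-1 \\ -\hat{A}_i^{(l)} & \text{for } l \leq i \leq L + l \\ 0 & \text{for } L + l + 1 \leq i \leq \bar{L}. \end{cases}$  (3-62)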

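I.3 Following (3-46) set

$\hat{D}_i^{(l)} := \hat{\bar{A}}_i^{(l)} - \hat{\bar{A}}_i^{(l-1)}, \quad i = 0, 1, \ldots, L + d + 1; \quad l = 2, 3, \ldots, d + 1;$  (3-63)

and

$\hat{\mathcal{D}} = \mathcal{D}$ as in (3-50) with $D_i^{(l)}$ and $\bar{A}_i^{(1)}$ replaced by $\hat{D}_i^{(l)}$ and $\hat{\bar{A}}_i^{(1)}$, respectively.  (3-64)

Define

$\hat{R}_{EE}(0) = \hat{\mathcal{D}} \, \hat{\mathcal{R}}_{ss(L+2d)} \, \hat{\mathcal{D}}^H.$  (3-65)

Carry out an eigendecomposition of $\hat{R}_{EE}(0)$ to calculate its largest eigenvalue $\lambda$
and the corresponding unit-norm eigenvector $Q_\lambda$. This yields the partial channel estimate up
to a scale factor (recall (3-55)) as

$\hat{F} = \sqrt{\lambda} \, Q_\lambda.$  (3-66)

I.4 Following (2-3), the MMSE equalizer with length $L_e$ and delay $d$ is calculated as

$[\, \hat{G}_{d,0} \;\; \hat{G}_{d,1} \;\; \cdots \;\; \hat{G}_{d,L_e} \,] = [\, \hat{F}^H \;\; 0 \;\; \cdots \;\; 0 \,] \, \hat{\mathcal{R}}_{yyL_e}^{-1}.$  (3-67)

Finally the equalized signal (up to a scale factor) is given by

$\hat{w}(k-d) = \sum_{i=0}^{L_e} \hat{G}_{d,i} \, y(k-i).$  (3-68)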


3.4.2  ALGORITHM  II:  Linear  Predictor-Based  Blind  equalizer  -  LP 

Here we will use (2-3) with $F_i$ ($i = 0, 1, \ldots, d$) estimated using the basic approach of [9]
and [14] which utilizes one-step ahead linear predictors ($l = 1$). Although [9] and [14] derive all
their results under the assumption of FIR channels with no common zeros, their results extend (with
straightforward modifications) to models that satisfy (H1)-(H5) by virtue of Theorem 1. By (3-6)
and (3-35), we have

$w(k) = \|F_0\|^{-2} F_0^H e(k|k-1).$  (3-69)

From (3-14) and (H4), we have the relations

$E\{w(k-l) s^H(k)\} = F_l^H$ for $l \geq 0$.  (3-70)

From (3-35) and (3-69), we have the relations (with $\bar{A}_i^{(1)} = -A_i^{(1)}$ as in (3-44))

$E\{w(k-l) s^H(k)\} = \|F_0\|^{-2} F_0^H \left[ R_{ss}(-l) + \sum_{i=1}^{L_1} \bar{A}_i^{(1)} R_{ss}(-l-i) \right].$  (3-71)

From (3-70) and (3-71) it follows that

$F_l^H = \|F_0\|^{-2} F_0^H \left[ R_{ss}^H(l) + \sum_{i=1}^{L_1} \bar{A}_i^{(1)} R_{ss}^H(l+i) \right].$  (3-72)

Based upon the above discussion, [9] and [14], we have the following algorithm:

II.1 Repeat step I.1 of Algorithm I.

II.2 Execute step I.2 of Algorithm I only for $l = 1$. Calculate

$\hat{R}_{ee}(0) := \hat{R}_{ss}(0) - \sum_{i=1}^{L_1} \hat{A}_i^{(1)} \hat{R}_{ss}(-i).$  (3-73)

Carry out an eigendecomposition of $\hat{R}_{ee}(0)$ to calculate its largest eigenvalue
$\lambda_0$ and the corresponding unit-norm eigenvector $Q_{\lambda_0}$. This yields the estimate (up
to a scale factor) of $F_0$ as

$\hat{F}_0 = \sqrt{\lambda_0} \, Q_{\lambda_0}.$  (3-74)

II.3 Estimate $F_l$ ($1 \leq l \leq d$) up to a scale factor as

$\hat{F}_l^H = [\lambda_0]^{-1} \hat{F}_0^H \left[ \hat{R}_{ss}^H(l) + \sum_{i=1}^{L_1} \hat{\bar{A}}_i^{(1)} \hat{R}_{ss}^H(l+i) \right].$  (3-75)

II.4 The MMSE equalizer of length $L_e + 1$ with delay $d$ is calculated (up to a scale factor) as

$[\, \hat{G}_{d,0} \;\; \cdots \;\; \hat{G}_{d,L_e} \,] = [\, \hat{F}_d^H \;\; \hat{F}_{d-1}^H \;\; \cdots \;\; \hat{F}_0^H \;\; 0 \;\; \cdots \;\; 0 \,] \, \hat{\mathcal{R}}_{yyL_e}^{-1}.$  (3-76)

Finally, execute (3-68).
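Steps II.2 and II.3 can be sketched compactly when the signal correlations and the one-step predictor coefficients are already available. The sketch is written with the unbarred coefficients $A_i^{(1)}$ of (3-35), so the predictor sums enter with a minus sign. The function name, the callable interface `R_ss(m)` and the example channel are assumptions made for illustration.

```python
import numpy as np

def lp_channel_taps(R_ss, A, d):
    """Estimate F_0 .. F_d (up to a common unit-modulus factor) from one-step LP quantities.

    R_ss(m) must return the N x N matrix E{s(k+m) s^H(k)};
    A is the list [A_1, ..., A_L1] of one-step predictor coefficient matrices.
    """
    # Prediction error covariance: R_ee(0) = R_ss(0) - sum_i A_i R_ss(-i)
    R_ee = R_ss(0) - sum(A_i @ R_ss(-(i + 1)) for i, A_i in enumerate(A))
    eigvals, eigvecs = np.linalg.eigh((R_ee + R_ee.conj().T) / 2)
    lam0 = eigvals[-1]
    F0 = np.sqrt(lam0) * eigvecs[:, -1]
    taps = [F0]
    for l in range(1, d + 1):
        inner = R_ss(-l) - sum(A_i @ R_ss(-l - (i + 1)) for i, A_i in enumerate(A))
        # F_l^H = lam0^{-1} F_0^H [...], so F_l is the conjugate of that row
        taps.append(((F0.conj() @ inner) / lam0).conj())
    return np.array(taps)

# Hypothetical FIR channel F_0 = [1, 0.5]^T, F_1 = [0.4, -1]^T with unit-variance white input
F_true = np.array([[1.0, 0.5], [0.4, -1.0]])
def R_ss(m):
    if m < 0:
        return R_ss(-m).T
    return sum((np.outer(F_true[l + m], F_true[l]) for l in range(2 - m)), np.zeros((2, 2)))
A1 = R_ss(1) @ np.linalg.inv(R_ss(0))          # one-step predictor from s(k-1) only
taps = lp_channel_taps(R_ss, [A1], d=1)
```

In this noise-free example `taps` equals `F_true` up to a common sign, in line with the scale ambiguity noted in steps II.2 and II.3.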


4  Blind  Equalization:  Common  Zeros 

Now we allow common subchannel zeros. In this case, since ideally we need infinite-length inverses
and linear predictors, the presented results hold true only approximately for finite-length
equalizers. Assume that (H1)-(H5) hold true.

4.1  Minimum-Phase  Zeros 

Here the SIMO transfer function is

$\mathcal{F}(z) = \frac{\mathcal{B}(z) B_c(z)}{A(z)}$  (4-1)

where $A(z)$ satisfies (H1), $\mathcal{B}(z)$ satisfies (H2) and $B_c(z)$ is a finite-degree scalar
polynomial that collects all the common zeros of the subchannels. Assume that

(H6) Given model (4-1), $B_c(z) \neq 0$ for $|z| \geq 1$.

Then while $\mathcal{F}_1(z) = A^{-1}(z) \mathcal{B}(z)$ has a finite inverse, $B_c^{-1}(z)$ is IIR
though causal under (H6).

By (H6) there exists a unique scalar polynomial $\mathcal{G}_c(z)$ such that

$\mathcal{G}_c(z) B_c(z) = 1$ where $\mathcal{G}_c(z) = \sum_{i=0}^{\infty} g_i z^{-i}$  (4-2)

and

$\sum_{i=0}^{\infty} |g_i| < \infty.$  (4-3)

Indeed, for some $0 < \alpha_1 < \infty$ and $0 < \beta_1 < 1$, we have

$|g_i| \leq \alpha_1 \beta_1^i \quad \forall i.$  (4-4)

Using (4-1)-(4-4), (2-1) and (2-2), it follows that there exists a $1 \times N$ polynomial
$\mathcal{G}'(z)$ such that for some $0 < \alpha_2 < \infty$ and $0 < \beta_2 < 1$

$\mathcal{G}'(z) = \sum_{i=0}^{\infty} G'_i z^{-i},$  (4-5)

$\|G'_i\| \leq \alpha_2 \beta_2^i \quad \forall i$  (4-6)

and

$\mathcal{G}'(z) \mathcal{F}(z) = 1.$  (4-7)

By (1-4), (H1) and (H3), for some $0 < \alpha_3 < \infty$ and $0 < \beta_3 < 1$, we have

$\|F_i\| \leq \alpha_3 \beta_3^i \quad \forall i.$  (4-8)

Consider

$\hat{s}(k|k-l) := \sum_{i=l}^{\infty} F_i w(k-i).$  (4-9)

By (4-5)-(4-7) and (1-2), it follows that

$w(k) = \sum_{i=0}^{\infty} G'_i s(k-i).$  (4-10)
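The geometric decay in (4-4) is easy to see numerically for a minimum-phase common factor. The sketch below computes the leading coefficients $g_i$ of $1/B_c(z)$ by long division; the function name and the example polynomial are hypothetical.

```python
def inverse_series(b_c, n_terms):
    """First n_terms coefficients g_i of 1 / B_c(z), B_c(z) = sum_i b_c[i] z^{-i}, b_c[0] != 0."""
    g = []
    for i in range(n_terms):
        # Matching coefficients in G_c(z) B_c(z) = 1: sum_j b_c[j] g[i-j] = delta(i)
        acc = 1.0 if i == 0 else 0.0
        for j in range(1, min(i, len(b_c) - 1) + 1):
            acc -= b_c[j] * g[i - j]
        g.append(acc / b_c[0])
    return g

# Common zero at z = 0.5 (inside the unit circle): g_i = 0.5^i
g = inverse_series([1.0, -0.5], 6)
```

For this example (4-4) holds with $\alpha_1 = 1$ and $\beta_1 = 0.5$; a zero outside the unit circle would instead make the causal coefficients grow, which is why (H6) is needed.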
Substituting for $w(k)$ from (4-10) in (4-9), it follows that

$\hat{s}(k|k-l) = \sum_{i=l}^{\infty} F_i \left( \sum_{m=0}^{\infty} G'_m s(k-i-m) \right)$  (4-11)

$\qquad\qquad\;\; = \sum_{n=l}^{\infty} C_n s(k-n)$  (4-12)

where (recall that $G'_i = 0$ for $i < 0$)

$C_n = \sum_{m=l}^{\infty} F_m G'_{n-m} = \sum_{m=l}^{n} F_m G'_{n-m}.$  (4-13)

It follows from (4-6), (4-8) and (4-13) that for some $0 < \alpha_4 < \infty$ and
$0 < \beta_4 < 1$, we have

$\|C_i\| \leq \alpha_4 \beta_4^i \quad \forall i.$  (4-14)

We now rewrite (3-14) as

$s(k) = e(k|k-l) + \hat{s}(k|k-l)$  (4-15)

where $e(k|k-l)$ is as in (3-6), but $\hat{s}(k|k-l)$ is given by (4-12). We have the following
result.

Theorem 2. Under (H1)-(H4) with $\mathcal{F}(z)$ in (H1) obeying (4-1), (H6) and for
$l = 1, 2, \ldots$, $\{s(k)\}$ can be decomposed as in (4-15) such that

$E\{e(k|k-l) s^H(k-m)\} = 0 \quad \forall m \geq l,$  (4-16)

and

$\hat{s}(k|k-l) = (s(k)|H_{k-l}(s)).$  (4-17)

Furthermore, let

$\hat{s}(k|k-l, k-M) := (s(k)|H_{k-l,M}(s))$  (4-18)

and

$e(k|k-l, k-M) := s(k) - \hat{s}(k|k-l, k-M).$  (4-19)

Then

$\lim_{M \to \infty} E\{\|e(k|k-l) - e(k|k-l, k-M)\|^2\} = 0.$  (4-20)
Proof: Eqns. (4-16) and (4-17) follow as in Theorem 1. We now turn to (4-20). It follows from
(4-17), (4-18) and [15, Chapter 1, Lemma 5.1.b] that

$\lim_{M \to \infty} E\{\|\hat{s}(k|k-l) - \hat{s}(k|k-l, k-M)\|^2\} = 0.$  (4-21)

Then (4-20) is immediate via (4-15) and (4-19). We will provide an alternative, interesting proof.
Consider

$E\{\|e(k|k-l) - e(k|k-l, k-M)\|^2\} = E\{\|e(k|k-l)\|^2\} + E\{\|e(k|k-l, k-M)\|^2\}$

$\qquad - E\{e^H(k|k-l) e(k|k-l, k-M)\} - E\{e^H(k|k-l, k-M) e(k|k-l)\}.$  (4-22)

Using (4-15), (4-17)-(4-19) and the orthogonality principle, it follows that

$E\{e^H(k|k-l) e(k|k-l, k-M)\} = E\{e^H(k|k-l) s(k)\} = E\{\|e(k|k-l)\|^2\}.$  (4-23)

Hence by (4-22) and (4-23), we have

$E\{\|e(k|k-l) - e(k|k-l, k-M)\|^2\} = E\{\|e(k|k-l, k-M)\|^2\} - E\{\|e(k|k-l)\|^2\}.$  (4-24)

Define

$\hat{s}_M(k|k-l) := \sum_{n=l}^{M} C_n s(k-n)$  (4-25)

where the $C_n$'s satisfy (4-13). Then

$E\{\|e(k|k-l, k-M)\|^2\} \leq E\{\|s(k) - \hat{s}_M(k|k-l)\|^2\}$

$\qquad = E\{\|s(k) - \hat{s}(k|k-l)\|^2\} + E\{\|\hat{s}(k|k-l) - \hat{s}_M(k|k-l)\|^2\}$

$\qquad = E\{\|e(k|k-l)\|^2\} + E\{\|\hat{s}(k|k-l) - \hat{s}_M(k|k-l)\|^2\}$  (4-26)

where we have used the facts that (4-18) holds true, $\hat{s}_M(k|k-l) \in H_{k-l,M}(s)$ and
$e(k|k-l) = s(k) - \hat{s}(k|k-l)$ is orthogonal to $H_{k-l}(s)$. By (4-24) and (4-26) we have

$0 \leq E\{\|e(k|k-l) - e(k|k-l, k-M)\|^2\} \leq E\{\|\hat{s}(k|k-l) - \hat{s}_M(k|k-l)\|^2\}.$  (4-27)

By (4-12) and (4-25), it follows that

$\hat{s}(k|k-l) - \hat{s}_M(k|k-l) = \sum_{n=M+1}^{\infty} C_n s(k-n).$  (4-28)

It then follows from (3-14), (4-14) and (4-28) that

$\lim_{M \to \infty} E\{\|\hat{s}(k|k-l) - \hat{s}_M(k|k-l)\|^2\} = 0.$  (4-29)

The desired result (4-20) then follows from (4-27) and (4-29). □


Theorem 2 clearly implies that for $M$ 'large enough' in (4-18), we can obtain $e(k|k-l, k-M)$
close enough to $e(k|k-l)$. Therefore, the approach of Sec. 3 becomes applicable to the current
case. Note that validity of (2-3) and (2-4) is unaffected by (H6). For a fixed $d$ and $L_e$ in (2-3)
and (2-4), one needs to estimate $F_i$ for $i = 0, 1, \ldots, d$. To this end, one should pick a
'large' $L$ in (3-40) and (3-42), and unlike Sec. 3.4, it need not be equal to $L_e$, rather
$L \geq L_e$.


Remark 3: The alternative proof of (4-20) given above may be used to obtain an upper
bound on the approximation error in (4-20) for finite $M$. By (4-27) and (4-28) we have

$E\{\|e(k|k-l) - e(k|k-l, k-M)\|^2\} \leq \mathrm{tr}\left\{ \sum_{n=M+1}^{\infty} \sum_{m=M+1}^{\infty} C_n R_{ss}(m-n) C_m^H \right\}.$  (4-30)

4.2  Arbitrary  Zeros 

In  this  case  (4-1)  is  true  but  Bc{z)  does  not  necessarily  satisfy  (H6).  We  may  rewrite  (4-1)  as 


where $\mathcal{F}_{AP}(z)$ is an allpass (rational) function such that
Bo{z)Bc{z-^)  =  f^AP{z)BMp{z) 

and $B_{MP}(z)$ is minimum-phase. Thus (within a scale factor) we have

J(U  =  5^0(2). 

A(^z) 

We  may  rewrite  (1-2)  as 

y(fc)  =  T{z)w'{k)  +  n(fc) 


where 

$$w'(k) := \mathcal{F}_{AP}(z)\,w(k). \tag{4-35}$$

Clearly $w'(k)$ satisfies (H4). Hence, (4-33)-(4-35) satisfy the requirements of Sec. 4.1. Therefore, one can “approximately” recover $w'(k)$ from the given data by applying the algorithm of Sec. 3.4.

In order to recover $w(k)$ from $w'(k)$, one needs to exploit the higher-order statistics of $\{w'(k)\}$; see [2], [3] and references therein.

5  Simulation  Examples 

Here we consider three simulation examples to illustrate the proposed approaches. The first two examples are modified versions of the example from [10]. Example 1 consists of an ARMA model whose MA part is taken from [10]. Example 2 consists of an MA (FIR) model where we augment the FIR channel of [10] with a zero at 0.5 where this zero is common to all of the three subchannels. Finally, the channel model in Example 3 is exactly as in [10].

For  computing  in  (3-60)  via  SVD,  we  picked  ei  =  0.01  in  (3-59).  Moreover,  “RyyL,  in  (3- 

67)  and  (3-76)  was  also  computed  using  SVD  where  all  singular  values  smaller  than  0.002 x  (largest 
singular  value)  were  neglected.  Thus,  calculation  of  was  regularized.  The  measurement  SNR 

is  defined  as 
Sne.  _ 

~  E{\nm^y 
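The singular-value truncation used above for regularization can be sketched as follows. This is a minimal illustration assuming only what the text states (singular values below 0.002 times the largest are neglected); the function name `truncated_pinv` and the toy matrix are hypothetical and are not taken from the report.

```python
import numpy as np

def truncated_pinv(R, rel_tol=0.002):
    """Pseudo-inverse of R that keeps only singular values >= rel_tol * largest."""
    U, s, Vh = np.linalg.svd(R, full_matrices=False)
    keep = s >= rel_tol * s[0]          # s is returned in descending order
    s_inv = np.zeros_like(s)
    s_inv[keep] = 1.0 / s[keep]
    # R^# = V diag(s_inv) U^H
    return (Vh.conj().T * s_inv) @ U.conj().T

# Toy, nearly singular matrix (illustrative values only).
R = np.diag([1.0, 0.5, 1e-6])
R_pinv = truncated_pinv(R)
```

Dropping the small singular values prevents the near-null directions of the correlation matrix estimate from being amplified by the inversion.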

The normalized MSE (i.e., MSE divided by $E\{|w(k)|^2\}$) and the probability of detection error ($P_e$) after equalization were taken as the two performance measures after averaging over 100 Monte
Carlo  runs.  The  equalized  data  were  rotated  and  scaled  before  calculating  the  two  performance 
measures.  After  designing  the  equalizers  based  on  the  given  data  record,  the  designed  equalizer 
was  applied  to  an  independent  record  of  length  3000  symbols  in  order  to  calculate  normalized  MSE 
and  Pe-  Therefore,  the  estimated  Pe  is  not  reliable  below  approximately  10  hence,  these  values 
are  not  shown  in  Figs.  1-3. 
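The rotate-and-scale step before scoring can be sketched as below. The report does not say how the rotation and scale were chosen; this sketch assumes a single complex gain fitted by least squares against the known test symbols, and unit-energy 4-QAM. The name `align_and_score` and the synthetic data are hypothetical.

```python
import numpy as np

def align_and_score(w_true, w_eq):
    """Rotate/scale w_eq by one complex gain g (least squares), then return
    (normalized MSE, symbol error rate) for unit-energy 4-QAM symbols."""
    # g minimizes sum |w_true - g * w_eq|^2 (np.vdot conjugates its first argument).
    g = np.vdot(w_eq, w_true) / np.vdot(w_eq, w_eq)
    z = g * w_eq
    nmse = np.mean(np.abs(w_true - z) ** 2) / np.mean(np.abs(w_true) ** 2)
    # Nearest 4-QAM point: sign of the real and imaginary parts.
    decisions = (np.sign(z.real) + 1j * np.sign(z.imag)) / np.sqrt(2)
    pe = np.mean(np.abs(decisions - w_true) > 1e-9)
    return nmse, pe

# Synthetic check: an equalizer output that is a rotated, scaled, slightly noisy
# copy of the transmitted symbols (3000 symbols, as in the test record length).
rng = np.random.default_rng(0)
T = 3000
w_true = (rng.choice([-1, 1], T) + 1j * rng.choice([-1, 1], T)) / np.sqrt(2)
noise = 0.01 * (rng.standard_normal(T) + 1j * rng.standard_normal(T))
w_eq = 0.5 * np.exp(0.7j) * w_true + noise
nmse, pe = align_and_score(w_true, w_eq)
```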

5.1 Example 1: IIR Channel With No Common Zero

We have $N = 3$ in (1-2) with $\mathcal{F}(z) = A^{-1}(z)B(z)$ where

$$A(z) = (1 - 0.5z^{-1})\,I_{3\times 3}$$

and $B(z)$ is 3 × 1 MA(6) obtained from [10] as follows. Consider a raised cosine pulse $p_6(t, 0.1)$ with a roll-off factor 0.1, truncated to a length of $6T_s$ ($T_s$ = symbol duration). As in [10], a two-ray multipath channel with (effective) impulse response

$$h(t) = p_6(t, 0.1) - 0.1\,p_6(t - T_s/3,\, 0.1) \tag{5-2}$$

was sampled at intervals of $T_s/3$ (starting at $t = -3T_s$) to create the $B(z)$ above. Transfer function $B(z)$ satisfies (H2) [10]; therefore, there exists a finite left inverse of length = Q (cf. Sec. 2).

The scalar input $w(k)$ is 4-QAM. An MMSE equalizer of length $L_e = 8$ (9 taps per subchannel, totaling 27 taps, i.e., overfitting) was designed with a delay $d = 3$ (arbitrarily selected just for illustration). The Algorithms I (MSLP) and II (LP) were applied for record lengths $T = 250$ and 500 symbols with varying SNR's. In order to apply MSLP, we picked $L = L_e = 8$ ($> n_a + n_b - 1 = 1 + 6 - 1 = 6$) in (3-40). We picked $L_l = L_e = 8$ for LP in (3-73) and (3-75). Fig. 1 shows the normalized MSE and the probability of error $P_e$. It is seen that the proposed design approach can handle IIR channels with little difficulty. Algorithm I (MSLP) performs the best.

5.2 Example 2: FIR Channel With Common Zero

Again we have $N = 3$ in (1-2) but with $\mathcal{F}(z) = B_c(z)B(z)$ where $B(z)$ is as in Example 1 and $B_c(z)$ is a scalar polynomial given by

$$B_c(z) = 1 - 0.5z^{-1}.$$

Thus all three subchannels have a common zero at 0.5. The input $w(k)$ is 4-QAM as in Example 1. Note that in this example a finite left inverse and finite-length multistep predictors do not exist. First, as in Example 1, an MMSE equalizer of length $L_e = 8$ was designed with a delay $d = 3$. The various design parameters for MSLP and LP ($L$ and $L_l$) were as in Example 1. Fig. 2 shows the normalized MSE and $P_e$. We also tried a longer equalizer with $L_e = 12$ and $d = 3$. Furthermore, we picked $L = L_e = 12$ for MSLP and $L_l = L_e = 12$ for LP. The normalized MSE and $P_e$ for this choice are shown in Fig. 3. It is seen from Figs. 2 and 3 that the proposed design approach can handle subchannels with common minimum-phase zeros with little difficulty, and that the approaches are not unduly sensitive to the choice of the various parameters involved. As in Example 1, Algorithm I performs the best.

5.3 Example 3: FIR Channel With No Common Zeros

This channel is exactly as in [10] with $N = 3$ in (1-2) and $\mathcal{F}(z) = B(z)$ where $B(z)$ is as in Example 1. As in Example 1, an MMSE equalizer of length $L_e = 8$ was designed with a delay $d = 3$ and the design parameters for MSLP and LP were kept unchanged. Fig. 4 shows the normalized MSE and $P_e$. As in the earlier examples, MSLP outperforms LP.

6  Conclusions 

Direct  blind  MMSE  equalization  of  SIMO  channels  using  only  the  second-order  statistics  of  the  data 
and  the  multistep  linear  prediction  approach  was  considered.  Such  channels  arise  when  antenna 
arrays  are  used  or  when  signals  with  excess  bandwidth  are  fractionally  sampled  or  when  both 
these  scenarios  are  applicable.  Unlike  the  past  work  on  this  problem  [4],[5],[8]-[14],  the  proposed 
solution is applicable to IIR channels and to SIMO systems having common zeros among the various
subchannels  so  long  as  the  common  zeros  are  minimum-phase.  In  case  of  nonminimum-phase  zeros, 
we  recover  an  allpass  filtered  version  of  the  original  input. 

Computer  simulation  examples  show  that  the  multistep  linear  predictors-based  MMSE  equalizer 
outperforms  the  one-step  linear  predictor-based  MMSE  equalizer  by  a  wide  margin. 

Future work includes performance analysis, adaptive implementation and extension to MIMO scenarios involving more than one information signal.

7  References 

[1] G.B. Giannakis and S.D. Halford, “Blind fractionally-spaced equalization of noisy FIR channels: direct and adaptive solutions,” IEEE Trans. Signal Processing, vol. SP-45, pp. 2277-2292, Sept. 1997.

[2] J.K. Tugnait, “Blind spatio-temporal equalization and impulse response estimation for MIMO channels using a Godard cost function,” IEEE Trans. Signal Processing, vol. SP-45, pp. 268-271, Jan. 1997.




[3]  Special  Issue,  IEEE  Transactions  on  Signal  Processing,  vol.  SP-45,  Jan.  1997. 

[4]  L.  Tong,  G.  Xu  and  T.  Kailath,  “A  new  approach  to  blind  identification  and  equalization  of 
multipath  channels,”  IEEE  Trans.  Information  Theory,  vol.  IT-40,  pp.  340-349,  March  1994. 

[5] E. Moulines, P. Duhamel, J. Cardoso and S. Mayrargue, “Subspace methods for blind identification of multichannel FIR filters,” IEEE Trans. Signal Processing, vol. SP-43, pp. 516-525, Feb. 1995.

[6]  J.K.  Tugnait,  “FIR  inverses  to  MIMO  rational  transfer  functions  with  application  to  blind 
equalization,”  in  Proc.  30th  Annual  Asilomar  Conf  Signals  Systems  Computers,  pp.  295-299, 
Pacific  Grove,  CA,  Nov.  1996. 

[7]  T.  Kailath,  Linear  Systems.  Englewood  Cliffs,  NJ:  Prentice-Hall,  1980. 

[8] D. Slock, “Blind fractionally-spaced equalization, perfect reconstruction filter banks and multichannel linear prediction,” in Proc. 1994 IEEE ICASSP, pp. IV:585-588, Adelaide, Australia, May 1994.

[9]  K.  Abed-Meraim  et  al.  “Prediction  error  methods  for  time-domain  blind  identification  of 
multichannel  FIR  filters,”  in  Proc.  1995  IEEE  ICASSP,  pp.  1968-1971,  Detroit,  MI,  May 
9-12,  1995. 

[10] Z. Ding, “Matrix outer-product decomposition method for blind multiple channel identification,” IEEE Trans. Signal Processing, vol. SP-45, pp. 3053-3061, Dec. 1997.

[11] K. Abed-Meraim et al., “On subspace methods for blind identification of single-input multiple-output FIR systems,” IEEE Trans. Signal Processing, vol. SP-45, pp. 42-55, Jan. 1997.

[12]  D.  Gesbert  and  P.  Duhamel,  “Robust  blind  channel  identification  and  equalization  based  on 
multi-step  predictors,”  in  Proc.  1997  ICASSP ,  pp.  3621-3624,  April  21-24,  1997. 

[13]  D.  Gesbert,  P.  Duhamel  and  S.  Mayrargue,  “Blind  multichannel  adaptive  MMSE  equalization 
with  controlled  delay,”  in  Proc.  Eighth  IEEE  Signal  Processing  Workshop  on  Statistical  Signal 
and  Array  Processing,  pp.  172-175,  Corfu,  Greece,  June  24-26,  1996. 




[14]  K.  Abed-Meraim  et  al.  “Prediction  error  method  for  second-order  blind  identification,”  IEEE 
Trans.  Signal  Processing.,  vol.  SP-45,  pp.  694-705,  March  1997. 

[15]  P.E.  Caines,  Linear  Stochastic  Systems.  Wiley:  New  York,  1988. 

[16]  H.V.  Poor,  An  Introduction  to  Signal  Detection  and  Estimation.  Springer- Verlag:  New  York, 
1988. 

[17]  D.G.  Luenberger,  Optimization  by  Vector  Space  Methods.  New  York:  John  Wiley,  1969. 

[18]  B.  Huang  and  J.K.  Tugnait,  “Blind  equalization  of  I.I.R.  single-input  multiple-output  channels 
with  common  zeros  using  second-order  statistics,”  in  Proc.  1998  IEEE  ICASSP,  Seattle,  WA, 
May  12-15,  1998. 




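[Figure panel title: EXAMPLE 1: IIR channel, 9×3 taps. Panels: probability of error and MSE (dB) in equalization, for record lengths T=250 and T=500.]

Fig. 1. Example 1: Probability of detection error and normalized MSE after equalization for various SNR's, averaged over 100 Monte Carlo runs. Left column: record length T=250 symbols, right column: record length T=500 symbols. Solid lines: Algorithm II (LP); dotted lines: Algorithm I (MSLP). Parameters: $L = L_e = L_l = 8$.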


[Figure panel title: EXAMPLE 2: FIR channel with common zero, 9×3 taps. Panels: probability of error and MSE (dB) in equalization, for record lengths T=250 and T=500.]

Fig. 2. Example 2: Probability of detection error and normalized MSE after equalization for various SNR's, averaged over 100 Monte Carlo runs. Left column: record length T=250 symbols, right column: record length T=500 symbols. Solid lines: Algorithm II (LP); dotted lines: Algorithm I (MSLP). Parameters: $L = L_e = L_l = 8$.


[Figure panel title: EXAMPLE 2: FIR channel with common zero, 13×3 taps. Panels: probability of error and MSE (dB) in equalization versus SNR (dB), for record lengths T=250 and T=500.]

Fig. 3. Example 2: Probability of detection error and normalized MSE after equalization for various SNR's, averaged over 100 Monte Carlo runs. Left column: record length T=250 symbols, right column: record length T=500 symbols. Solid lines: Algorithm II (LP); dotted lines: Algorithm I (MSLP). Parameters: $L = L_e = L_l = 12$.


[Figure panel title: EXAMPLE 3: FIR channel, no common zeros, 9×3 taps. Panels: probability of error and MSE (dB) in equalization, for record lengths T=250 and T=500.]

Fig. 4. Example 3: Probability of detection error and normalized MSE after equalization for various SNR's, averaged over 100 Monte Carlo runs. Left column: record length T=250 symbols, right column: record length T=500 symbols. Solid lines: Algorithm II (LP); dotted lines: Algorithm I (MSLP). Parameters: $L = L_e = L_l = 8$.


Adaptive Blind Separation of Convolutive Mixtures of
Independent Linear Signals


Jitendra K. Tugnait
Department of Electrical Engineering
Auburn University, Auburn, AL 36849, USA
Tel.: (334)844-1846 FAX: (334)844-1809
Email: tugnait@eng.auburn.edu

Abstract 

This  paper  is  concerned  with  the  problem  of  blind  separation  of  independent  signals  (sources) 
from  their  linear  convolutive  mixtures.  The  problem  consists  of  recovering  the  sources  up  to  shaping 
filters  from  the  observations  of  MIMO  system  output.  The  various  signals  are  assumed  to  be  linear 
non-Gaussian  but  not  necessarily  i.i.d.  (independent  and  identically  distributed).  Recently  an 
iterative,  normalized  higher-order  cumulant  maximization  based  approach  was  developed  using 
the  fourth-order  normalized  cumulants  of  the  “beamformed”  data.  This  approach  was  source- 
iterative,  i.e.,  the  sources  were  extracted  (at  each  sensor)  and  cancelled  one-by-one,  in  the  process 
yielding  a  decomposition  of  the  given  data  at  each  sensor  into  its  independent  signal  components. 
In  this  paper  an  adaptive  implementation  of  the  above  approach  is  developed  using  a  stochastic 
gradient  approach.  Some  further  enhancements  including  a  Wiener  filter  implementation  for  signal 
separation  and  adaptive  filter  reinitialization  are  also  provided.  Computer  simulation  examples  are 
presented to illustrate the proposed approach.

1 Introduction


Given noisy measurements $y_i(k)$, ($i = 1, 2, \ldots, N$), at time $k$ at $N$ sensors, let these measurements be a linear convolutive mixture of $M$ source signals $x_j(k)$, ($j = 1, 2, \ldots, M$):

$$y_i(k) = \sum_{j=1}^{M} \mathcal{U}_{ij}(z)\,x_j(k) + n_i(k), \quad i = 1, 2, \ldots, N, \tag{1-1}$$

or, in vector form,

$$y(k) = \mathcal{U}(z)\,x(k) + n(k), \tag{1-2}$$

where the $ij$-th element of $\mathcal{U}(z)$ is $\mathcal{U}_{ij}(z)$, $y(k) = [y_1(k)\; y_2(k)\; \cdots\; y_N(k)]^T$, similarly for $x(k)$ and $n(k)$, $z$ is both the backward-shift operator (i.e., $z^{-1}x(k) = x(k-1)$, etc.) as well as the complex variable in the $\mathcal{Z}$-transform, $x_j(k)$ is the $j$-th input at sampling time $k$, $y_i(k)$ is the $i$-th output, $n_i(k)$ is the additive Gaussian measurement noise, and

$$\mathcal{U}_{ij}(z) := \sum_{l=-\infty}^{\infty} u_{ij}(l)\,z^{-l} \tag{1-3}$$

is the scalar transfer function with $x_j(k)$ as the input and $y_i(k)$ as the output. We allow all of the above variables to be complex-valued.
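The mixing model (1-1)-(1-2) can be sketched numerically as follows. This is an illustration under simplifying assumptions that are not in the text: the mixing filters are taken to be causal FIR (the sum in (1-3) is truncated to $l = 0, \ldots, L$), samples before $k = 0$ are taken as zero, and the noise is white complex Gaussian. The function name `convolutive_mixture` and the array layout are hypothetical.

```python
import numpy as np

def convolutive_mixture(x, u, noise_std=0.0, rng=None):
    """y_i(k) = sum_j sum_l u[l, i, j] * x_j(k - l) + n_i(k).

    x : (M, T) source samples; u : (L+1, N, M) causal FIR mixing taps.
    Samples before k = 0 are taken as zero.
    """
    rng = np.random.default_rng() if rng is None else rng
    n_taps, N, M = u.shape
    T = x.shape[1]
    y = np.zeros((N, T), dtype=complex)
    for l in range(n_taps):
        # Tap l acts on the sources delayed by l samples.
        y[:, l:] += u[l] @ x[:, :T - l]
    noise = rng.standard_normal((N, T)) + 1j * rng.standard_normal((N, T))
    return y + noise_std / np.sqrt(2) * noise

# Example: M = 2 sources observed at N = 3 sensors through 3-tap filters.
rng = np.random.default_rng(0)
M, N, T = 2, 3, 50
x = rng.standard_normal((M, T)) + 1j * rng.standard_normal((M, T))
u = rng.standard_normal((3, N, M)) + 1j * rng.standard_normal((3, N, M))
y = convolutive_mixture(x, u)
```

An instantaneous mixture is the special case with a single tap, `u.shape[0] == 1`.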

Suppose that we design a MIMO dynamic system $\mathcal{C}(z)$ with $N$ inputs and $M$ outputs such that the overall $M \times M$ system

$$\mathcal{T}(z) := \mathcal{C}(z)\,\mathcal{U}(z) \tag{1-4}$$

decouples the source signals. Following the 2 × 2 case considered in [7], this implies that we must have ($\mathcal{T}_{ij}(z)$ denotes the $ij$-th element of $\mathcal{T}(z)$)

$$\mathcal{T}_{ij}(z) \begin{cases} \equiv 0 & \text{for } i \ne i_j \\ \not\equiv 0 & \text{for } i = i_j \end{cases} \tag{1-5}$$

where $i = 1, 2, \ldots, M$; $j = 1, 2, \ldots, M$ and $i_j \in \{1, 2, \ldots, M\}$ such that $i_j \ne i_l$ for $j \ne l$. That is, in every column and every row of $\mathcal{T}(z)$ there is exactly one non-zero entry. In a blind separation problem, the nonzero entries of $\mathcal{T}(z)$ are allowed to be a scalar linear system (shaping filter), unlike the equalization problems where they must be constant gains and/or pure delays.
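The decoupling condition, exactly one non-zero entry in every row and every column of the overall system, is easy to check on a finite impulse response. The sketch below assumes the overall system is available as an array of taps; the function name `is_decoupling` and the tolerance are hypothetical choices.

```python
import numpy as np

def is_decoupling(t, tol=1e-12):
    """t : (taps, M, M) impulse response of the overall system.
    True if every row and every column has exactly one entry that is not
    identically zero (a shaping filter is allowed in that entry)."""
    nonzero = np.max(np.abs(t), axis=0) > tol      # (M, M) support pattern
    return bool(np.all(nonzero.sum(axis=0) == 1) and np.all(nonzero.sum(axis=1) == 1))

# Separation up to permutation and shaping filters: entry (1,2) is 1 + 0.5 z^{-1},
# entry (2,1) is the delayed gain 2 z^{-1}.
t_ok = np.zeros((2, 2, 2))
t_ok[:, 0, 1] = [1.0, 0.5]
t_ok[:, 1, 0] = [0.0, 2.0]

# Residual cross-talk in entry (1,1) violates the condition.
t_bad = t_ok.copy()
t_bad[0, 0, 0] = 0.3
```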

The  problem  considered  above  arises  in  a  wide  variety  of  applications:  in  array  processing  for 
wideband  sources  under  multipath  propagation,  in  speech  enhancement  (“cocktail  party”  problem), 
and in noise cancellation where the reference microphone does not measure noise alone (“crosstalk”); see [1]-[8], [20], [23]-[25], [29]-[33] and references therein. Blind source separation becomes necessary when the propagation between sources and sensors cannot be accurately modeled for lack of knowledge of multipath environment, unknown (or imprecisely known) array manifold, etc. Separation of sources differs from blind equalization [9], [10], [13], [14], [17] in that the source signals are not necessarily i.i.d. (independent and identically distributed).

The problem of blind source separation has received increasing attention in the past few years beginning with [1]. The work done can be classified into two broad categories based upon the underlying propagation model: instantaneous mixtures and convolutive mixtures. In linear instantaneous mixture models the transfer function $\mathcal{U}(z)$ in (1-2) is a constant matrix (called the mixing matrix). This case arises for narrowband signals where any multipath produces relative delays small enough to cause just a phase shift [33]. The work reported in [1], [3]-[6], [8], [16], [29], [33] and [12] deals with this class of models; this list is by no means exhaustive, see also references therein. The general model (1-2) represents a linear convolutive mixture. The work reported in [2], [7], [17], [19], [20], [30]-[32] and [11] (and references therein) deals with linear convolutive mixture (dynamic mixing) models. In this paper we consider dynamic mixing where we allow $N \ge M$ ($N$ = number of sensors, $M$ = number of sources) with $M$ arbitrary, whereas quite a few existing papers are restricted to either $M = N = 2$ ([7], [24], [25]) or $M = N$ ([2], [17], [11]).

Past work on separation of convolutive mixtures may be categorized into several classes: time-domain approaches ([2], [17], [19], [20], [23], [24], [25], [30], [31], [11]), frequency-domain approaches ([7], [32]), adaptive (recursive) approaches ([17], [24], [25], [30]), and non-recursive (batch) approaches ([2], [7], [8], [18]-[20], [23], [24], [30], [32], [11]). In this paper we present time-domain adaptive approaches. As noted earlier, quite a few of the existing approaches are limited either to $M = N = 2$ ([7], [24], [25]) or to $M = N$ ([2], [17], [11]). Although [31] and [32] treat a general case, their analyses are restricted to the case of two sources ($M = 2$). In this paper we consider a general case of $N \ge M$ with $M$ arbitrary.

A  key  assumption  made  in  this  paper  is  that  the  various  sources  emit  linear  non-Gaussian 
signals  (i.e.,  signals  that  can  be  represented  as  the  output  of  a  stable  linear  system  driven  by  an 
i.i.d.  non-Gaussian  sequence);  this  assumption  also  appears  in  [19], [20]  and  [11].  This  assumption 
is  clearly  satisfied  by  most  digital  communications  signals.  This  allows  one  to  treat  the  problem  (or 
a crucial part of it) as a (blind) linear system identification problem using higher-order statistics. Therefore, existing results on blind system identification ([9], [10], [13], [19]-[21] and [11]) become quite relevant; however, most of these papers investigate non-recursive approaches.

As noted earlier, blind source separation is not blind equalization. It is separation of a dynamic mixture of signals into its independent component signals (or a linearly filtered version thereof). In [7] (and others) this is handled by assuming that (for $M = N$) the diagonal entries of $\mathcal{U}(z)$ are unity. We will not follow this approach as it is not clear how to extend this to $N > M$, and moreover, we allow a row of $\mathcal{U}(z)$ to be identically zero (“faulty” sensor) so that it may not always be possible to make such a choice. Our objective in blind separation is to decompose $y(k)$ of (1-2) into its independent components $\mathcal{U}^{(i)}(z)x_i(k)$ ($\mathcal{U}^{(i)}(z)$ denotes the $i$-th column of $\mathcal{U}(z)$) without having a prior knowledge of $\mathcal{U}(z)$. A batch (non-adaptive) version of this paper appears in [23]. Independently of [23], a similar approach using only second-order statistics appeared in [30]. In [30] it is required that rank$\{\mathcal{U}(z)\} = M$ for any $z$ (including $z = \infty$ but excluding $z = 0$) whereas we require rank$\{\mathcal{U}(z)\} = M$ only for $|z| = 1$. Note that the rank restriction of [30] implies that $N > M$ for nontrivial systems (else for $N = M$, $\mathcal{U}(z)$ will be unimodular, i.e., its determinant is a constant). In our formulation we allow $N \ge M$. On the other hand, [30] does not require the signals $\{x(k)\}$ to be non-Gaussian or linear since any stationary second-order process admits a linear representation (Wold's decomposition [15]). However, our formulation relies crucially on $\{x(k)\}$ being linear non-Gaussian. Finally, given $\{y(k)\}$, we also seek a minimum mean-square error (MMSE) estimate of $\mathcal{U}^{(i)}(z)x_i(k)$ whereas [30] (or [11]) does not. A system identification approach to blind source separation is followed in [19], [20] and [11]. However, the results in these papers deal with identifiability issues and no specific algorithm for source separation has been provided therein. Moreover, the MMSE source separation approach considered in this paper is not considered in [19], [20] and [11].

We now turn to a brief review of past work on adaptive blind system identification as it relates to the problem under consideration. An interesting input-iterative adaptive approach using prewhitened observations and the fourth-order cumulant of the inverse-filtered data at zero-lag has been considered in [21] and [17]. The inverse filter is constrained to have a lossless filter structure which is realized using a lossless lattice filter. Such a restriction can lead to ill-conditioning of the algorithm of [21] as one iteratively extracts input sequences. A fix to this is proposed in [17] but it works only for the two-input case. Refs. [21] and [17] are restricted to ‘square’ systems: number of inputs ($M$) equal to the number of outputs ($N$), whereas in this paper we allow $N \ge M$, a common occurrence in array processing. Moreover, in this paper we perform no prewhitening, rather we operate directly on the given measurements. A consequence of this is that the ill-conditioning of [21], [17] referred to above does not occur in our approach. Refs. [21] and [17] are restricted to real-valued data whereas we also consider complex-valued observations.

In [23] an iterative, inverse filter criteria based approach has been developed for blind separation of multichannel non-Gaussian processes using the fourth-order normalized cumulants of the inverse filtered data at zero-lag. The approach is input-iterative, i.e., the inputs are extracted and removed one-by-one. The matrix impulse response is then obtained by cross-correlating the extracted inputs with the observed outputs. A by-product of this approach is a decomposition of the given data at each sensor into its independent signal components, thereby achieving blind signal separation. In this paper we develop a stochastic gradient-based “recursification” of all of the batch optimization steps in [23].

The paper is organized as follows. The precise model assumptions and our basic approach to blind separation of convolutive mixtures are described in Sec. 2. The inverse-filter criteria-based approach of [13], the underlying identifiability results and the source separation solution implicit in the solution of [13] (see also [23]) are all briefly discussed in Sec. 3. In Sec. 4 we develop a stochastic gradient-based “recursification” of all of the batch optimization steps discussed in Sec. 3. An MMSE solution with controlled delay ($d$ in Sec. 5) to the problem of blind signal separation given the channel impulse response estimates is discussed and analyzed in Sec. 5. Finally, two computer simulation examples are presented in Sec. 6 to illustrate the proposed approaches.

2  Model  Assumptions  and  Signal  Separation 

We impose the following conditions on model (1-1)-(1-2):

(AS1) $N \ge M$, i.e., there are at least as many outputs as inputs.

(AS2) The vector sequence $\{x(k)\}$ is stationary, its various components are mutually independent, and the coupling system (i.e. the transfer function $\mathcal{U}(z)$) is stable. Moreover, $\{x(k)\}$ is linear, i.e.

$$x(k) = \mathcal{V}(z)\,w(k), \tag{2-1}$$

where $\{w(k)\}$ is a zero-mean, $M$-vector stationary non-Gaussian process, temporally i.i.d. and spatially independent, with nonzero fourth cumulants. Because of the mutual independence of the components of $x(k)$, we take $\mathcal{V}(z)$ to be diagonal.

(AS3) Consider the composite system

$$y(k) = \mathcal{U}(z)\,\mathcal{V}(z)\,w(k) + n(k) =: \mathcal{F}(z)\,w(k) + n(k). \tag{2-2}$$

Assume that rank$\{\mathcal{F}(z)\} = M$ for any $|z| = 1$.

(AS4) Since the composite system is causal, we have

$$\mathcal{F}(z) = \sum_{l=0}^{\infty} F_l\, z^{-l} \approx \sum_{l=0}^{L} F_l\, z^{-l}. \tag{2-3}$$

(AS5) The noise $\{n(k)\}$ is a zero-mean, stationary Gaussian sequence independent of $\{w(k)\}$. Moreover, it is ergodic.

Note that by (AS2) and (AS3), the statement rank$\{\mathcal{F}(z)\} = M$ for any $|z| = 1$ is equivalent to the statement rank$\{\mathcal{U}(z)\} = M$ for any $|z| = 1$. Since $\mathcal{F}(z)$ is stable, for IIR models (2-3) acts as a “good” approximation for large $L$. We will not require precise knowledge of $L$ in the sequel. It is convenient to assume an FIR model. We will denote the $ij$-th element of $\mathcal{F}(z)$ by $\mathcal{F}_{ij}(z)$.
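For an FIR model, the rank condition of (AS3) can be screened numerically by evaluating the transfer function on a frequency grid and inspecting its smallest singular value. The sketch below is only a grid check (a rank drop between grid points could be missed); the function name and the two toy systems are hypothetical.

```python
import numpy as np

def min_singular_value_on_unit_circle(f, n_freq=256):
    """f : (L+1, N, M) FIR taps F_l. Returns the smallest singular value of
    F(e^{jw}) = sum_l F_l e^{-jwl} over a uniform grid of n_freq frequencies."""
    n_taps = f.shape[0]
    omegas = 2 * np.pi * np.arange(n_freq) / n_freq
    smallest = np.inf
    for w in omegas:
        phases = np.exp(-1j * w * np.arange(n_taps))
        Fw = np.tensordot(phases, f, axes=(0, 0))        # (N, M) matrix at frequency w
        smallest = min(smallest, np.linalg.svd(Fw, compute_uv=False)[-1])
    return smallest

# Full rank everywhere on |z| = 1: F(z) = [[1, 0.5 z^{-1}], [0, 1]].
f_good = np.zeros((2, 2, 2))
f_good[0] = np.eye(2)
f_good[1, 0, 1] = 0.5

# Rank deficient at z = 1: F(z) = (1 - z^{-1}) I.
f_bad = np.zeros((2, 2, 2))
f_bad[0] = np.eye(2)
f_bad[1] = -np.eye(2)

sigma_good = min_singular_value_on_unit_circle(f_good)
sigma_bad = min_singular_value_on_unit_circle(f_bad)
```

A value of the smallest singular value close to zero signals a frequency at which the $M$ inputs cannot be distinguished from the $N$ outputs.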

Let $\mathcal{F}^{(i)}(z)$ denote the $i$-th column of $\mathcal{F}(z)$. In blind convolutive signal separation we are interested in decomposing the observations at the various sensors into their independent components. That is, our objective is to estimate $\mathcal{F}^{(i)}(z)w_i(k)$ for $i = 1, 2, \ldots, M$ given $\{y(k)\}$ without having a priori knowledge of $\mathcal{F}(z)$. Note that this is different from the solutions in [2] and [7] (and others) where one obtains a “single” estimate of $x_i(k)$ (or a “shaped” version of it); recall (1-4) and (1-5). By discarding all but one of the $N$ entries of the $N$-vector $\mathcal{F}^{(i)}(z)w_i(k)$, we can get the solution specified by (1-5). Because of inherent scale and shift ambiguities (see Remark 1 in Sec. 3) we will end up estimating a scaled and shifted version of $\mathcal{F}^{(i)}(z)w_i(k)$. Thus, by assuming linearity of $\{x(k)\}$ (cf. (2-1)), we have converted the blind signal separation problem into a blind MIMO channel identification and deconvolution problem to which a solution exists in [13]. It was shown in [23] how the solution of [13] applies to the current problem. In this paper we develop an adaptive implementation of the approach of [23]. Also, the deconvolution solution of [13] is not necessarily an MMSE (minimum mean-square error) solution. To remedy this and to design MMSE estimators with “controlled delay” ($d$ in Sec. 5), we also consider other modifications. Using the channel identification results of [13], we consider designing adaptive MMSE estimates of (scaled and shifted versions of) $\mathcal{F}^{(i)}(z)w_i(k)$.

3  An  Iterative  Solution  Based  on  Inverse-Filter  Criteria 

In [13] an iterative, inverse filter criteria based approach has been developed for deconvolution of multichannel non-Gaussian processes using the fourth-order normalized cumulants of the inverse filtered data at zero-lag. In [23] this approach has been applied to the blind convolutive signal separation problem using a non-recursive algorithm. The approach is input-iterative, i.e., the inputs are extracted and removed one-by-one. The matrix impulse response is then obtained by cross-correlating the extracted inputs with the observed outputs. In this paper we develop a stochastic gradient-based “recursification” of all of the batch optimization steps in [13] and [23]. In this section we briefly discuss the batch (non-recursive) approach of [23]; its adaptive version is developed in Sec. 4.

Let CUM$_4(w)$ denote the fourth-order cumulant of a complex-valued scalar zero-mean random variable $w$, defined as

$$\mathrm{CUM}_4(w) := \mathrm{cum}_4\{w, w^*, w, w^*\} = E\{|w|^4\} - 2\,[E\{|w|^2\}]^2 - |E\{w^2\}|^2 \tag{3-1}$$

where $*$ denotes complex conjugation [15]. We will use the notation $\gamma_{4w_i} = \mathrm{CUM}_4(w_i(k))$ and $\sigma^2_{w_i} = E\{|w_i(k)|^2\}$. Consider a $1 \times N$ row-vector polynomial equalizer (filter) $\mathcal{C}^T(z)$, with its $j$-th entry denoted by $\mathcal{C}_j(z)$, operating on the data vector $y(k)$. Let the equalizer output be denoted by $e(k)$:

$$e(k) = \sum_{i=1}^{N} \mathcal{C}_i(z)\,y_i(k). \tag{3-2}$$

Following [13] consider maximization of the cost

$$J := \frac{|\mathrm{CUM}_4(e(k))|}{[E\{|e(k)|^2\}]^2} \tag{3-3}$$

for designing a linear equalizer to recover one of the inputs. It is shown in [13] that when (3-3) is maximized w.r.t. $\mathcal{C}(z)$, then (3-2) reduces to

$$e(k) = d\,w_{j_0}(k - k_0), \tag{3-4}$$

where $d$ is some complex constant, $k_0$ is some integer, $j_0$ indexes some input out of the given $M$ inputs, i.e., the equalizer output is a possibly scaled and shifted version of one of the system inputs. It has been established in [13] that under (AS1)-(AS4) and no noise, such a solution exists and if doubly-infinite equalizers are used, then all locally stable stationary points of the given cost w.r.t. the equalizer coefficients are also characterized by solutions such as (3-4).
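The behaviour of the cost (3-3) can be illustrated with sample averages. The sketch below assumes unit-energy 4-QAM inputs (a choice made here for illustration) and compares an output that is a scaled, shifted copy of one input, as in (3-4), with an output that still mixes two inputs. The function names `cum4` and `normalized_cost` are hypothetical.

```python
import numpy as np

def cum4(e):
    """Sample estimate of CUM4(e) in (3-1) for zero-mean complex samples e."""
    return (np.mean(np.abs(e) ** 4)
            - 2 * np.mean(np.abs(e) ** 2) ** 2
            - np.abs(np.mean(e ** 2)) ** 2)

def normalized_cost(e):
    """Sample estimate of the cost (3-3): |CUM4(e)| / (E|e|^2)^2."""
    return np.abs(cum4(e)) / np.mean(np.abs(e) ** 2) ** 2

def qam4(rng, T):
    """T i.i.d. unit-energy 4-QAM symbols."""
    return (rng.choice([-1, 1], T) + 1j * rng.choice([-1, 1], T)) / np.sqrt(2)

rng = np.random.default_rng(0)
T = 20000
w1, w2 = qam4(rng, T), qam4(rng, T)
J_single = normalized_cost(3.0 * np.roll(w1, 2))   # scaled, shifted single input
J_mixed = normalized_cost(w1 + w2)                 # equal-weight mixture of two inputs
```

The cost is invariant to the scale and shift of a single input and drops when inputs are mixed, which is why its maximization isolates one input.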

A source-iterative solution is given by:

Step 1. Maximize (3-3) w.r.t. the equalizer $\mathcal{C}(z)$ to obtain (3-4). Let

$$\gamma_{4j_0} = \mathrm{CUM}_4(e(k)) = \mathrm{CUM}_4(d\,w_{j_0}(k)).$$

Step 2. Cross-correlate $\{e(k)\}$ (of (3-4)) with the given data (2-2) and define a possibly scaled and shifted estimate of $f_{ij_0}(\tau)$ as

$$\hat{f}_{ij_0}(\tau) := \frac{E\{y_i(k)\,e^*(k-\tau)\}}{E\{|e(k)|^2\}} \tag{3-6}$$

where $\mathcal{F}_{ij}(z) = \sum_{l=-\infty}^{\infty} f_{ij}(l)\,z^{-l}$. Consider now the reconstructed contribution of $e(k)$ to the data $y_i(k)$ ($i = 1, 2, \ldots, N$), denoted by $\hat{y}_{i,j_0}(k)$:

$$\hat{y}_{i,j_0}(k) := \sum_{l=-\infty}^{\infty} \hat{f}_{ij_0}(l)\,e(k-l).$$

Step 3. Remove the above contribution from the data to define the outputs of a MIMO system with $N$ outputs and $M - 1$ inputs. These are given by

$$y'_i(k) := y_i(k) - \hat{y}_{i,j_0}(k).$$

Step 4. If $M > 1$, set $M \leftarrow M - 1$, $y_i(k) \leftarrow y'_i(k)$, and go back to Step 1, else quit.

In practice, all the expectations in (3-6) are replaced with their sample averages over appropriate data records.
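Steps 2 and 3 with sample averages can be sketched as follows. This is an illustration under assumptions that are not in the text: the extracted signal $e(k)$ is taken as an exact scaled, shifted copy of one input (so Step 1 is not implemented), time shifts are circular for brevity, and the inputs are 4-QAM. The function name `deflate` and all numerical values are hypothetical.

```python
import numpy as np

def deflate(y, e, lags):
    """Steps 2-3 with sample averages and circular shifts.

    y : (N, T) sensor data; e : (T,) extracted input estimate;
    lags : integer lags tau at which the impulse response is estimated.
    Returns (f_hat, y_contrib, y_residual).
    """
    power = np.mean(np.abs(e) ** 2)
    f_hat = {}
    y_contrib = np.zeros_like(y, dtype=complex)
    for tau in lags:
        e_shift = np.roll(e, tau)                              # e(k - tau)
        f_hat[tau] = (y @ e_shift.conj()) / (len(e) * power)   # (3-6), all sensors at once
        y_contrib += np.outer(f_hat[tau], e_shift)             # reconstructed contribution
    return f_hat, y_contrib, y - y_contrib

rng = np.random.default_rng(1)
T, N, M, n_taps = 20000, 2, 2, 3
w = (rng.choice([-1, 1], (M, T)) + 1j * rng.choice([-1, 1], (M, T))) / np.sqrt(2)
f = rng.standard_normal((n_taps, N, M)) + 1j * rng.standard_normal((n_taps, N, M))
# Contribution of each input to the sensors (circular convolution).
contrib = [sum(np.outer(f[l, :, j], np.roll(w[j], l)) for l in range(n_taps))
           for j in range(M)]
y = contrib[0] + contrib[1]

d, k0 = 0.8 * np.exp(0.3j), 1
e = d * np.roll(w[0], k0)                        # e(k) = d * w(k - k0) for the first input
f_hat, y_hat, y_res = deflate(y, e, lags=range(-k0, n_taps - k0))
rel_err = np.linalg.norm(y_hat - contrib[0]) / np.linalg.norm(contrib[0])
```

Although the estimated taps are scaled by $1/d$ and shifted by $k_0$, the reconstructed contribution is not: the scale and shift cancel when the taps are convolved with $e(k)$.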


It has been shown in [13] that

$$\hat{y}_{i,j_0}(k) = \mathcal{F}_{ij_0}(z)\,w_{j_0}(k), \tag{3-9}$$

i.e., we have decomposed the observations at the various sensors into their independent components: $\hat{y}_{i,j_0}(k)$ in (3-9) represents the contribution of $\{w_{j_0}(k)\}$ to the $i$-th sensor, achieving blind signal separation.

Remark 1. It has been shown in [13] that under the conditions (AS1)-(AS4) and no noise, the proposed iterative approach is capable of blind identification of a MIMO transfer function $\mathcal{F}(z)$ up to a time-shift, a scaling and a permutation matrix, provided that we allow doubly-infinite equalizers. That is, given $\mathcal{F}(z)$, we end up with an $\mathcal{A}(z)$ where the two are related via

$$\mathcal{A}(z) = \mathcal{F}(z)\,D\,\Lambda\,P \tag{3-10}$$

where $D$ is an $M \times M$ “time-shift” diagonal matrix with diagonal entries such as $z^{-k_0}$ (recall (3-4)), $\Lambda$ is an $M \times M$ diagonal scaling matrix, and $P$ is an $M \times M$ permutation matrix. The following result has been proved in [13].

Theorem 1 [13]: Given the model (2-2) such that $n(k) = 0$, and given the true 4th-order and 2nd-order cumulant functions of the model output $\{y(k)\}$ such that conditions (AS1)-(AS4) hold true, suppose that doubly-infinite equalizers are used in Steps 1-4 of the iterative procedure of Sec. 3. Then this procedure yields a transfer function $\mathcal{A}(z)$ satisfying (3-10). □

Remark 2. The results of [13] are based upon the use of doubly-infinite inverse filters. If we assume that $\mathcal{F}(z)$ has finite impulse response (FIR) and rank$\{\mathcal{F}(z)\} = M$ for any $z$ (including $z = \infty$ but excluding $z = 0$), then finite-length inverse filters suffice. For an analysis and further elaborations, see [22] and [16], where a Godard cost function is considered; the results of [22] and [16] can be easily modified to apply to the cost (3-3), and the basic conclusions remain unchanged. The following result follows from [13] and [16].

Theorem 2: Given the FIR model (2-2) such that $n(k) = 0$ and conditions (AS1) and (AS4) hold true, suppose that Steps 1-4 of the iterative procedure of Sec. 3 are used and the record length tends to infinity. Then this procedure yields a transfer function $\mathcal{A}(z)$ satisfying (3-10) if one of the following holds true:

(A) Rank$\{\mathcal{F}(z)\} = M$ for any $z$ (including $z = \infty$ but excluding $z = 0$), and doubly-infinite equalizers are used.

(B) Rank$\{\mathcal{F}(z)\} = M$ for any $z$ (including $z = \infty$ but excluding $z = 0$), $\mathcal{F}(z)$ is column-reduced, and FIR equalizers with length $L_e > (2M-1)L_c - 1$ are used, where $L_c$ = channel length.

□

4  Adaptive  Algorithm 

In this section we develop a stochastic gradient-based “recursification” of all of the batch optimization steps discussed in Sec. 3. Theorems 1 and 2 of Sec. 3 motivate and justify the algorithm developed in this section.


4.1  First  Stage  Maximization  of  Normalized  Fourth  Cumulant 
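Let the length of the equalizer $C(z)$ be $L_e$ and let

$$C_i(z) = \sum_{l=0}^{L_e-1} c_i(l)\,z^{-l}. \tag{4-1}$$

This allows us to rewrite (3-2) as

$$e(k) = \sum_{i=1}^{N}\sum_{l=0}^{L_e-1} c_i(l)\,y_i(k-l) = C^T Y(k) \tag{4-2}$$

where

$$Y(k) = [\,Y_1^T(k)\ \cdots\ Y_N^T(k)\,]^T, \tag{4-3}$$

$$Y_i(k) = [\,y_i(k)\ \ y_i(k-1)\ \cdots\ y_i(k-L_e+1)\,]^T, \tag{4-4}$$

$$C = [\,C_1^T\ \ C_2^T\ \cdots\ C_N^T\,]^T \tag{4-5}$$

and

$$C_i = [\,c_i(0)\ \ c_i(1)\ \cdots\ c_i(L_e-1)\,]^T. \tag{4-6}$$

Define

$$m_4 = E\{|e(k)|^4\}, \quad m_2 = E\{|e(k)|^2\}, \quad \tilde m_2 = E\{e^2(k)\}. \tag{4-7}$$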

Then, showing explicit dependence upon $C$, (3-3) may be rewritten as

$$J(C) = \mathrm{sgn}(\gamma_4)\left[\frac{m_4 - |\tilde m_2|^2}{m_2^2} - 2\right] \tag{4-8}$$

where

$$\gamma_4 = m_4 - 2m_2^2 - |\tilde m_2|^2. \tag{4-9}$$

Let $\nabla_C$ denote a gradient operator (w.r.t. a vector $C$). We will follow [26] in formally defining the complex derivatives. Then we have

$$\nabla_{C^*} e(k) = 0 \quad \text{and} \quad \nabla_{C^*} e^*(k) = Y^*(k). \tag{4-10}$$

Using the above results in (4-7) we have

$$\nabla_{C^*} m_4 = 2E\{e^2(k)e^*(k)Y^*(k)\}, \quad \nabla_{C^*} m_2 = E\{e(k)Y^*(k)\} \tag{4-11}$$

and

$$\nabla_{C^*} \tilde m_2 = 0, \quad \nabla_{C^*} \tilde m_2^* = 2E\{e^*(k)Y^*(k)\}. \tag{4-12}$$

Using (4-8)-(4-12) and after some simplification, we have

$$\nabla_{C^*} J(C) = \frac{2\,\mathrm{sgn}(\gamma_4)}{m_2^3}\Big[m_2 E\{|e(k)|^2 e(k)Y^*(k)\} - m_2\tilde m_2 E\{e^*(k)Y^*(k)\} - [m_4 - |\tilde m_2|^2]\,E\{e(k)Y^*(k)\}\Big]. \tag{4-13}$$

We will use a stochastic gradient method for recursification of the maximization of $J(C)$, using an ‘instantaneous’ gradient as an estimate of (4-13). Given the estimate $C(k-1)$ of the tap-gains at time $k-1$, the stochastic gradient method computes the update $C(k)$ at time $k$ as

$$\tilde C(k) = C(k-1) + \mu_1 \nabla_{C^*} J_k(C(k-1)) \tag{4-14}$$

$$C(k) = \frac{\tilde C(k)}{\|\tilde C(k)\|} \tag{4-15}$$

where $\mu_1$ is the update step-size and $\nabla_{C^*} J_k(C(k-1))$ is the instantaneous gradient of the cost $J$ (w.r.t. $C^*$) at time $k$ evaluated at $C(k-1)$. Since the cost $J$ is invariant to any scaling of $C$, we normalize $C$ in (4-15) to have unit norm. From (4-13) we have the approximation

$$\nabla_{C^*} J_k(C(k)) = \mathrm{sgn}(\gamma_{4k})\,\frac{2}{m_{2k}^3}\Big\{\big[m_{2k}\big(e^2(k) - \tilde m_{2k}\big)e^*(k) - \big(m_{4k} - |\tilde m_{2k}|^2\big)e(k)\big]\,Y^*(k)\Big\} \tag{4-16}$$

where

$$m_{2k} = (1-\mu_2)\,m_{2(k-1)} + \mu_2|e(k)|^2, \tag{4-17}$$

$$\tilde m_{2k} = (1-\mu_2)\,\tilde m_{2(k-1)} + \mu_2 e^2(k), \tag{4-18}$$

$$m_{4k} = (1-\mu_2)\,m_{4(k-1)} + \mu_2|e(k)|^4, \tag{4-19}$$

$$\gamma_{4k} = m_{4k} - 2m_{2k}^2 - |\tilde m_{2k}|^2 \tag{4-20}$$

and

$$e(k) = C^T(k)\,Y(k). \tag{4-21}$$

In (4-17)-(4-19) the various quantities represent estimates based upon sample averaging, the (exponential window) memory being controlled by the forgetting factor $\mu_2$ ($0 < \mu_2 < 1$). The initializations for (4-17)-(4-19) are: $m_{20} = m_{40} = \tilde m_{20} = 0$.
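One iteration of (4-14)-(4-21) can be sketched as follows. This is only a minimal illustration in plain Python, not the implementation used for the simulations: the function name `sg_update`, the `state` dictionary and its keys are hypothetical, the equalizer output is formed with the previous tap vector (an assumption), and a guard for a zero power estimate is added for safety.

```python
def sg_update(C, Y, state, mu1=0.0005, mu2=0.015):
    """One stochastic-gradient step; C and Y are equal-length lists of complex numbers.

    `state` holds the running moments under the (hypothetical) keys
    "m2", "m2t" (the non-conjugated second moment) and "m4".
    """
    # Equalizer output, cf. (4-21); the previous taps are used (assumption).
    e = sum(c * y for c, y in zip(C, Y))
    # Exponentially windowed moment estimates, cf. (4-17)-(4-19).
    state["m2"] = (1 - mu2) * state["m2"] + mu2 * abs(e) ** 2
    state["m2t"] = (1 - mu2) * state["m2t"] + mu2 * e * e
    state["m4"] = (1 - mu2) * state["m4"] + mu2 * abs(e) ** 4
    m2, m2t, m4 = state["m2"], state["m2t"], state["m4"]
    if m2 == 0.0:
        return list(C), e  # no signal energy seen yet: leave the taps unchanged
    gamma4 = m4 - 2 * m2 ** 2 - abs(m2t) ** 2  # cf. (4-20)
    sign = 1.0 if gamma4 >= 0 else -1.0
    # Scalar part of the instantaneous gradient, cf. (4-16).
    g = sign * (2.0 / m2 ** 3) * (
        m2 * (e * e - m2t) * e.conjugate() - (m4 - abs(m2t) ** 2) * e
    )
    # Gradient step (4-14) followed by unit-norm normalization (4-15).
    C_new = [c + mu1 * g * y.conjugate() for c, y in zip(C, Y)]
    norm = sum(abs(c) ** 2 for c in C_new) ** 0.5
    return [c / norm for c in C_new], e
```

Starting from `state = {"m2": 0.0, "m2t": 0j, "m4": 0.0}`, which mirrors the zero initialization stated above, the function would be called once per time step with the stacked data vector of (4-3).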
4.2 First Stage Signal Cancellation

Now we discuss implementation of (3-6) via sample averaging using an exponential window controlled by a forgetting factor $\mu_3$. Define ($L_1, L_2 > 0$)

$$f_i(n) := \text{impulse response from } \{e(k)\} \text{ (input) to } \{y_i(k)\} \text{ (output)} \tag{4-22}$$

and

$$F_n = [\,f_1(n)\ \ f_2(n)\ \cdots\ f_N(n)\,]^T. \tag{4-23}$$

By Sec. 3, when (4-8) is maximized, $e(k)$ satisfies (3-4) for some $j_0 \in \{1, 2, \ldots, M\}$, so that for a suitable choice of $L_1$ and $L_2$,

$$[\mathcal{F}(z)]_{ij_0}\,w_{j_0}(k) = \sum_{n=-L_1}^{L_2} [F_n]_{i1}\,e(k-n), \tag{4-24}$$

where $[A]_{ij}$ denotes the $ij$-th element of the matrix $A$. Note that (4-24) (cf. (3-7)) represents the contribution of the extracted source at stage 1 to the measurement at time $k$ at the $i$-th sensor. In order to implement (3-7) and (3-8), we need recursive estimates of $F_n$. The estimate $F_n(k)$ of $F_n$ at time $k$ is provided by

$$F_n(k) = R_n(k)/m_{ee}(k) \tag{4-25}$$

where

$$m_{ee}(k) = (1-\mu_3)\,m_{ee}(k-1) + \mu_3|e(k)|^2, \tag{4-26}$$

$$R_n(k) = (1-\mu_3)\,R_n(k-1) + \mu_3\,y(k)\,e^*(k-n). \tag{4-27}$$
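The recursions (4-25)-(4-27) and the subtraction of the estimated source contribution can be sketched as below. The sketch assumes that the whole record is stored, so that the 'future' samples $e(k-n)$ with $n < 0$ are available; an on-line implementation would instead have to delay the data by $L_1$ samples. The name `cancel_source` and its argument layout are hypothetical.

```python
def cancel_source(y, e, L1, L2, mu3=0.0005):
    """Recursively estimate F_n and subtract the extracted source from the data.

    y : list over time of per-sensor sample lists (length N each)
    e : list of equalizer outputs, same length as y
    Returns (residual, F), where F[n] is the final estimate of F_n.
    """
    K, N = len(e), len(y[0])
    lags = range(-L1, L2 + 1)
    R = {n: [0j] * N for n in lags}  # R_n(k), cf. (4-27)
    m_ee = 0.0                       # m_ee(k), cf. (4-26)
    residual = []
    for k in range(K):
        m_ee = (1 - mu3) * m_ee + mu3 * abs(e[k]) ** 2
        y_hat = [0j] * N
        for n in lags:
            if not 0 <= k - n < K:
                continue  # e(k - n) lies outside the stored record
            e_kn = e[k - n]
            R[n] = [(1 - mu3) * r + mu3 * yi * e_kn.conjugate()
                    for r, yi in zip(R[n], y[k])]
            if m_ee > 0:
                # F_n(k) e(k - n) with F_n(k) = R_n(k) / m_ee(k), cf. (4-25) and (4-24).
                for i in range(N):
                    y_hat[i] += R[n][i] / m_ee * e_kn
        residual.append([yi - yh for yi, yh in zip(y[k], y_hat)])
    F = {n: [r / m_ee for r in R[n]] for n in lags} if m_ee > 0 else dict(R)
    return residual, F
```

The returned residual plays the role of the data passed to the next stage, and `F` holds the channel estimates that are reused later for the MMSE designs.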


4.3  Multistage  Algorithm 

In Secs. 4.1 and 4.2 we discussed the first stage of the algorithm where we have $N$ sensors and $M$ sources. Now we put it all together following the source-iterative solution of Sec. 3 and discuss extraction of $M$ sources including the cancellation of the extracted sources. We will use the superscript $(m)$ to denote the various quantities pertaining to stage $m$. These have been used previously in Secs. 4.1 and 4.2 without this superscript; for instance, $C^{(m)}(k)$ now denotes the estimate of the tap-gain vector at time $k$ at stage $m$, etc.

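Initialization:

$$Y^{(1)}(k) = Y(k) \text{ as in (4-3), and } y^{(1)}(k) = y(k). \tag{4-28}$$

DO FOR $m = 1, 2, \ldots, M$

$$\tilde C^{(m)}(k) = C^{(m)}(k-1) + \mu_1 \nabla_{C^*} J_k^{(m)}(C^{(m)}(k-1)) \tag{4-29}$$

$$C^{(m)}(k) = \frac{\tilde C^{(m)}(k)}{\|\tilde C^{(m)}(k)\|} \tag{4-30}$$

where

$$\nabla_{C^*} J_k^{(m)}(C^{(m)}(k)) = \mathrm{sgn}(\gamma_{4k}^{(m)})\,\frac{2}{(m_{2k}^{(m)})^3}\Big\{\big[m_{2k}^{(m)}\big((e^{(m)}(k))^2 - \tilde m_{2k}^{(m)}\big)e^{(m)*}(k) - \big(m_{4k}^{(m)} - |\tilde m_{2k}^{(m)}|^2\big)e^{(m)}(k)\big]\,Y^{(m)*}(k)\Big\}, \tag{4-31}$$

$$m_{2k}^{(m)} = (1-\mu_2)\,m_{2(k-1)}^{(m)} + \mu_2|e^{(m)}(k)|^2, \tag{4-32}$$

$$\tilde m_{2k}^{(m)} = (1-\mu_2)\,\tilde m_{2(k-1)}^{(m)} + \mu_2 (e^{(m)}(k))^2, \tag{4-33}$$

$$m_{4k}^{(m)} = (1-\mu_2)\,m_{4(k-1)}^{(m)} + \mu_2|e^{(m)}(k)|^4, \tag{4-34}$$

$$\gamma_{4k}^{(m)} = m_{4k}^{(m)} - 2(m_{2k}^{(m)})^2 - |\tilde m_{2k}^{(m)}|^2 \tag{4-35}$$

and

$$e^{(m)}(k) = C^{(m)T}(k)\,Y^{(m)}(k). \tag{4-36}$$

Set

$$\hat y^{(m)}(k) = \sum_{n=-L_1}^{L_2} F_n^{(m)}(k)\,e^{(m)}(k-n) \tag{4-37}$$

where $\hat y^{(m)}(k)$ represents (cf. (3-7)) the contribution of the extracted source at the $m$-th stage to the measurements at time $k$, and where ($n = -L_1, -L_1+1, \ldots, L_2$)

$$F_n^{(m)}(k) = R_n^{(m)}(k)/m_{ee}^{(m)}(k), \tag{4-38}$$

$$m_{ee}^{(m)}(k) = (1-\mu_3)\,m_{ee}^{(m)}(k-1) + \mu_3|e^{(m)}(k)|^2, \tag{4-39}$$

$$R_n^{(m)}(k) = (1-\mu_3)\,R_n^{(m)}(k-1) + \mu_3\,y^{(m)}(k)\,e^{(m)*}(k-n) \tag{4-40}$$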


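and

$$y^{(m+1)}(k) = y^{(m)}(k) - \hat y^{(m)}(k). \tag{4-41}$$

Define

$$\hat Y_i^{(m)}(k) = [\,\hat y_i^{(m)}(k)\ \ \hat y_i^{(m)}(k-1)\ \cdots\ \hat y_i^{(m)}(k-L_e+1)\,]^T \tag{4-42}$$

where $\hat y_i^{(m)}(k)$ denotes the $i$-th component of $\hat y^{(m)}(k)$. Set

$$\hat Y^{(m)}(k) = [\,\hat Y_1^{(m)T}(k)\ \cdots\ \hat Y_N^{(m)T}(k)\,]^T \tag{4-43}$$

and

$$Y^{(m+1)}(k) = Y^{(m)}(k) - \hat Y^{(m)}(k). \tag{4-44}$$

ENDDO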

The sequence $\{\hat y^{(m)}(k)\}$ in (4-37) represents the contribution of the extracted source at the $m$-th stage to the measurements at time $k$.


Remark 3. If $M$ is unknown, the proposed approach will still work in the following sense. If $M$ is underestimated, some sources will be missed, but each extracted source will correspond to one of the true sources. If $M$ is overestimated, all the sources will be recovered, in addition to some meaningless “junk” outputs in stages $M_0 + 1$ and later, where $M_0$ denotes the true number of sources. Indeed, one can test the ‘residuals’ (4-44) (see also (3-8)) to check whether any significant non-Gaussian components remain in the data before implementing another equalizer in parallel. We do not pursue this aspect in this paper. □


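Running Cost. To monitor the convergence of the equalizers in various stages of the algorithm, it is useful to calculate a running cost (4-8) without the sign. Let $\bar J_k^{(m)}$ denote the running cost for the $m$-th stage at time $k$, given by

$$\bar J_k^{(m)} = \frac{\bar m_{4k}^{(m)} - |\bar{\tilde m}_{2k}^{(m)}|^2}{(\bar m_{2k}^{(m)})^2} - 2 \tag{4-45}$$

where

$$\bar m_{2k}^{(m)} = (1-\mu_4)\,\bar m_{2(k-1)}^{(m)} + \mu_4|e^{(m)}(k)|^2, \tag{4-46}$$

$$\bar{\tilde m}_{2k}^{(m)} = (1-\mu_4)\,\bar{\tilde m}_{2(k-1)}^{(m)} + \mu_4 (e^{(m)}(k))^2, \tag{4-47}$$

$$\bar m_{4k}^{(m)} = (1-\mu_4)\,\bar m_{4(k-1)}^{(m)} + \mu_4|e^{(m)}(k)|^4. \tag{4-48}$$

For all of the simulations presented in Sec. 6, we took $\mu_4 = 0.002$.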
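The running cost is inexpensive to track. A minimal sketch in plain Python, assuming the equalizer output sequence of one stage is available as a list of complex numbers (the name `running_cost` is hypothetical, and the moments are started from zero as an assumption):

```python
def running_cost(samples, mu4=0.002):
    """Return the running cost after each sample of an equalizer output sequence."""
    m2, m2t, m4 = 0.0, 0j, 0.0
    costs = []
    for e in samples:
        # Exponentially windowed moments, cf. (4-46)-(4-48).
        m2 = (1 - mu4) * m2 + mu4 * abs(e) ** 2
        m2t = (1 - mu4) * m2t + mu4 * e * e
        m4 = (1 - mu4) * m4 + mu4 * abs(e) ** 4
        # Normalized fourth-cumulant estimate, cf. (4-45); undefined while m2 is zero.
        costs.append((m4 - abs(m2t) ** 2) / m2 ** 2 - 2.0 if m2 > 0 else float("nan"))
    return costs
```

For a noise-free, unit-modulus 4-QAM sequence the value settles near -1, since $m_4 = m_2 = 1$ and $\tilde m_2 = 0$ for such a sequence.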

5  Further  Modifications 

5.1  MMSE  Signal  Separation 

5.1.1  Non-recursive  Processing 

Recall that our objective is to estimate $\mathcal{F}^{(i)}(z)w_i(k)$ for $i = 1, 2, \ldots, M$ given $\{y(k)\}$. The non-recursive solution of Sec. 3 provides a solution in the form of (3-7) (see also (3-9)), whereas the adaptive solution of Sec. 4 has it as (4-37). The deconvolution solution of [13] is not necessarily an MMSE solution. It has been shown in [35] and [36] (and references therein) that for the constant modulus algorithm, under certain conditions, the resultant solution (extracted source in the first stage) may be “close” to an MMSE solution. It is possible that a similar result may hold for the problem under consideration here. However, even if it were true, the resulting solution may not be the best possible because the performance of an MMSE solution depends upon the “delay” ($d$ in the sequel) used, and the blind algorithms provide no control over the choice of the delay parameter [36]. A by-product of the solutions of Secs. 3 and 4 is a set of estimates of the system/channel impulse response. These estimates can be used to design MMSE estimators of $\mathcal{F}^{(i)}(z)w_i(k)$ with a controlled delay $d$ to obtain an “optimum” performance. These considerations are nevertheless heuristic as we are ignoring any effects of additive noise on the channel estimates.

Let $F_l^{(i)}$ denote the $i$-th column of $F_l$. We wish to design a linear MMSE filter (equalizer) of length $L_e$ to estimate $y^{(j)}(k-d)$ as $\tilde y^{(j)}(k-d)$ given $y(l)$ for $l = k, k-1, \ldots, k-L_e+1$, where $d > 0$,

$$y^{(j)}(k) := \mathcal{F}^{(j)}(z)\,w_j(k) = \sum_{l=0} F_l^{(j)}\,w_j(k-l) \tag{5-1}$$

and

$$\tilde y^{(j)}(k-d) := \sum_{i=0}^{L_e-1} G_i\,y(k-i). \tag{5-2}$$

Both $L_e$ and the delay $d$ are “pre-determined.” Using the orthogonality principle [27], the normal equations for the MMSE estimator are given by

$$E\{[y^{(j)}(k-d) - \tilde y^{(j)}(k-d)]\,y^{H}(l)\} = 0 \tag{5-3}$$

for $l = k-L_e+1, k-L_e+2, \ldots, k$, where $H$ denotes the Hermitian operation (complex conjugate transpose). Using (2-2), (2-3), (5-1), (5-2) and assumption (AS2), and assuming that the system model is completely known, after some manipulations (5-3) simplifies to

$$\sum_{i=0}^{L_e-1} G_i\,R_{yy}(p-i) = \sigma_{w_j}^2 \sum_{k=0} F_k^{(j)} F_{k+d-p}^{(j)H} \tag{5-4}$$

where

$$H_{d-p} := \sum_{k=0} F_k^{(j)} F_{k+d-p}^{(j)H} \tag{5-5}$$

and

$$R_{yy}(p) := E\{y(t+p)\,y^{H}(t)\}. \tag{5-6}$$

Note that a shift in the sequence $\{F_k^{(j)}\}$ leaves $H_{d-p}$ unaffected. The desired solution when the model is completely known is therefore given by

$$[\,G_0\ \ G_1\ \cdots\ G_{L_e-1}\,] = \sigma_{w_j}^2\,[\,H_d\ \ H_{d-1}\ \cdots\ H_{d-L_e+1}\,]\,\mathcal{R}_{yy}^{-1} \tag{5-7}$$

where $\mathcal{R}_{yy}$ denotes an $[NL_e] \times [NL_e]$ correlation matrix with $R_{yy}(j-i)$ as its $ij$-th block element.

In order to obtain a data-based solution, we simply replace all the unknowns by their estimates. Since there is an inherent scale ambiguity in estimating the composite channel impulse response (cf. (3-10)), we design the equalizer only up to a scale factor by omitting $\sigma_{w_j}^2$ from (5-7).

Remark 4. Selection of Delay $d$: In designing (5-2) the delay $d$ was pre-determined. It is well-known [28] that the choice of $d$ has a strong influence on the resultant mean-square error. One may choose to select $d$ via exhaustive optimization as detailed below. Using the orthogonality principle [27], the MMSE when (5-2) is used is given by

$$J(d) := \mathrm{tr}\,E\{[y^{(j)}(k-d) - \tilde y^{(j)}(k-d)]\,y^{(j)H}(k-d)\} \tag{5-8}$$

where tr stands for trace. Using (5-1)-(5-7), we can simplify (5-8) to

$$J(d) = \mathrm{tr}\,E\{y^{(j)}(k-d)\,y^{(j)H}(k-d)\} - J'(d) \tag{5-9}$$

where

$$J'(d) := \sigma_{w_j}^4\,\mathrm{tr}\{\mathcal{H}\,\mathcal{R}_{yy}^{-1}\,\mathcal{H}^{H}\} \tag{5-10}$$

and

$$\mathcal{H} := [\,H_d\ \ H_{d-1}\ \cdots\ H_{d-L_e+1}\,]. \tag{5-11}$$

Since the first term on the right-side of (5-9) is independent of $d$, minimizing $J(d)$ w.r.t. $d$ is equivalent to maximizing $J'(d)$ or $\mathrm{tr}\{\mathcal{H}\,\mathcal{R}_{yy}^{-1}\,\mathcal{H}^{H}\}$. In practice, we replace the unknowns in (5-10) with their estimates. □
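The design equations and the delay search of Remark 4 can be illustrated on a deliberately simplified case. The sketch below assumes a single sensor ($N = 1$, so every block is a scalar), known FIR channels, unit-variance white sources and white noise; all function names (`solve`, `corr`, `mmse_design`, `best_delay`) are hypothetical, and the small linear solver merely stands in for whatever matrix inversion routine one would normally use.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [[complex(v) for v in row] + [complex(bi)] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            t = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= t * M[c][j]
    x = [0j] * n
    for r in reversed(range(n)):
        s = sum(M[r][j] * x[j] for j in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def corr(f, q):
    """sum_k f(k) conj(f(k+q)), with taps outside the channel support taken as zero."""
    return sum(complex(f[k]) * complex(f[k + q]).conjugate()
               for k in range(len(f)) if 0 <= k + q < len(f))


def mmse_design(channels, j, d, Le, noise_var):
    """Scalar analogue of (5-7) and of J'(d) for source j (unit-variance sources)."""
    def r_yy(p):
        # R_yy(p) = E{y(t+p) y*(t)}: channel autocorrelations plus white noise at p = 0.
        return sum(corr(f, -p) for f in channels) + (noise_var if p == 0 else 0.0)
    h = [corr(channels[j], d - p) for p in range(Le)]  # H_{d-p}, cf. (5-5)
    # Normal equations sum_i G_i R_yy(p - i) = H_{d-p}, cf. (5-4), written as A G = h.
    A = [[r_yy(p - i) for i in range(Le)] for p in range(Le)]
    G = solve(A, h)
    j_prime = sum(g * hp.conjugate() for g, hp in zip(G, h)).real
    return G, j_prime


def best_delay(channels, j, delays, Le, noise_var):
    """Exhaustive search for the delay maximizing J'(d), in the spirit of Remark 4."""
    return max(delays, key=lambda d: mmse_design(channels, j, d, Le, noise_var)[1])
```

For example, `best_delay([[1.0, 0.5]], 0, range(-2, 6), 4, 0.1)` would scan the delays from -2 to 5 for a two-tap channel; in a data-based design the channel taps and correlations would be replaced by their estimates.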

5.1.2  Adaptive  Implementation 

We now turn to an adaptive implementation of (5-7). Note that $\mathcal{R}_{yy}^{-1}$ does not depend upon the stage $m$ of the algorithm of Sec. 4.3; it depends solely upon the measured data. Its computation can easily be recursified by using the matrix inversion lemma: see Table 13.1 on p. 569 in [34]; we omit the details. Denote the data-based adaptive estimate of $\mathcal{R}_{yy}^{-1}$ at time $k$ as $\mathcal{P}_{yy}(k)$.

Let $H_l^{(m)}(k)$ denote the estimate of $H_l$ at stage $m$ and time $k$ of the multistage algorithm of Sec. 4.3. Note that $F_n^{(m)}(k)$ in (4-38) (see also (3-6), (4-23) and (4-25)) denotes an estimate of $F_n^{(i)}$ for some $i \in \{1, 2, \ldots, M\}$ (up to a scale factor and time shift, cf. Theorem 1). Therefore, from (5-5) we have the adaptive implementation at stage $m$ as

$$H_l^{(m)}(k) = \sum_{n=-L_1} F_n^{(m)}(k)\,F_{n+l}^{(m)H}(k), \quad l = d, d-1, \ldots, d-L_e+1. \tag{5-12}$$

Combining the above two results, the adaptive MMSE estimate with lag $d$, $\tilde y^{(m)}(k)$, at stage $m$ (corresponding to $\{\hat y^{(m)}(k)\}$ in (4-37)) is given by

$$\tilde y^{(m)}(k) := \sum_{i=0}^{L_e-1} G_i^{(m)}(k)\,y(k-i) \tag{5-13}$$

where

$$[\,G_0^{(m)}(k)\ \cdots\ G_{L_e-1}^{(m)}(k)\,] = [\,H_d^{(m)}(k)\ \cdots\ H_{d-L_e+1}^{(m)}(k)\,]\,\mathcal{P}_{yy}(k). \tag{5-14}$$

Eqn. (5-13) provides an approximately MMSE blind signal separation solution at stage $m$ and time $k$. [Note that we have ignored any effects of additive noise on the channel estimates.]

Selection of “optimum” $d$ as discussed in Remark 4 recursifies in an obvious way; therefore, the details are omitted.

5.2  Adaptive  Filter  Reinitialization 

In the source-iterative (multistage) approaches of Secs. 3 and 4, any errors in cancelling the extracted sources from the preceding stages $l = 1, 2, \ldots, m-1$ affect the performance at stage $m$. The only stage that is immune to this phenomenon is stage $m = 1$. The multistage approach is used to make sure that each stage converges to a distinct source. A possible solution to alleviate this error propagation from stage to stage is to use parallel stages, where we still have $M$ stages for $M$ sources but they all operate directly on the given data record in parallel, with different initializations of the equalizers (filters). The basic problem with such an approach is how to ensure that each stage converges to a distinct source. Here we propose to initialize the parallel stages using the results of the serial multistage implementation of Sec. 4.3 coupled with an MMSE solution similar to that of Sec. 5.1. A similar though not identical approach has been proposed in [35] in a slightly different context where the MMSE initializer has not been used.

For stage $m = 1$, there are no changes to the algorithm of Sec. 4.3. For stages $m \geq 2$, run the algorithm of Sec. 4.3 till the running cost (4-45) reaches a steady-state. Given the estimates of the subchannel impulse response at stage $m$, we can design an MMSE filter (in a fashion similar to Sec. 5.1.2) to estimate $w_j(k-d)$ given $y(l)$ for $l = k, k-1, \ldots, k-L_e+1$. Let the extracted $w_j(k)$ at stage $m$ be denoted by $\hat w^{(m)}(k)$. Mimicking Sec. 5.1.2, a recursive MMSE solution at stage $m$ and time $k$ is given by

$$\hat w^{(m)}(k-d) := \sum_{i=0}^{L_e-1} g_i^{(m)}(k)\,y(k-i) \tag{5-15}$$

where

$$[\,g_0^{(m)}(k)\ \cdots\ g_{L_e-1}^{(m)}(k)\,] = [\,F_d^{(m)H}(k)\ \ F_{d-1}^{(m)H}(k)\ \cdots\ F_{-L_1}^{(m)H}(k)\ \ 0\ \cdots\ 0\,]\,\mathcal{P}_{yy}(k). \tag{5-16}$$

At stage $m$ and time $k$, $\hat w^{(m)}(k-d)$ is an MMSE estimate (with delay $d$) of $w_j(k-d)$ for parallel implementation. Comparing (4-2) with (5-15) we see that $C(z) = \sum_{i=0}^{L_e-1} g_i^{(m)}(k)\,z^{-i}$ is the desired MMSE initializer.

Selection of “optimum” $d$ mimicking Remark 4 recursifies in an obvious way; therefore, the details are omitted.

6  Simulation  Example 

In this section we present two simulation examples to illustrate the proposed approaches. In both examples $F_0$ is of rank $1 < M = 2$, implying that rank$\{\mathcal{F}(z)\} < M = 2$ at $z = \infty$, so that $\mathcal{F}(z)$ does not have a finite-length left inverse. In both examples the length of the inverse filters was 11 samples per sensor (output) for the approach of Sec. 4. The proposed approach was applied with $M = 2$ inverse filters and $M - 1 = 1$ signal cancellers running in parallel, each successive inverse filter being put in operation after waiting for 200 samples w.r.t. the previous stage. To design a channel estimate-based MMSE signal separator (following Sec. 5), we chose the length of the MMSE filter as $L_e = 11$, the same as for the inverse filters of Sec. 4. Furthermore, we chose the delay $d$ for the MMSE separator design by following the procedure outlined in Remark 4 in Sec. 5.

6.1  Example  1 

Consider a 2-input 2-output MA(6) system model resulting in $N = 2$ and $M = 2$ in (2-2). Its $2 \times 2$ transfer function $\mathcal{F}(z)$ was chosen as

0.2  -f  O.Sz-^  4-  OAz-^  0.5  -  O.Oz"^ 

0.32-^  -  0.6z-2  -0.21Z-1  _  0.52-2  +  qj2z-^  +  O.SOz'^  -H  0.21z-®  J 

The input $\{w_1(k)\}$ is an i.i.d. complex Gaussian-mixture (independent and identically distributed real and imaginary parts with the real part being $\mathcal{N}(0,1)$ with probability 0.9 and $\mathcal{N}(0,4)$ with probability 0.1) with 4th normalized cumulant 0.7433. The input $\{w_2(k)\}$ is an i.i.d. 4-QAM sequence with 4th normalized cumulant $-1$. The additive noise is temporally and spatially white, zero-mean, complex Gaussian distributed (independent real and imaginary parts). The powers of $\{w_j(k)\}$ for $j = 1$ and 2 were scaled so as to have $E\{\|\mathcal{F}^{(1)}(z)w_1(k)\|^2\} = E\{\|\mathcal{F}^{(2)}(z)w_2(k)\|^2\}$.


For signal separation the performance measure was taken to be the signal-to-interference-and-noise ratio (SINR) per source signal, defined as

$$\mathrm{SINR}_j = \frac{E\{\|y^{(j)}(k)\|^2\}}{E\{\|y^{(j)}(k) - \hat\alpha\,\tilde y^{(j)}(k)\|^2\}} \tag{6-2}$$

where $\hat\alpha$ is that value of the scalar $\alpha$ which minimizes $E\{\|y^{(j)}(k) - \alpha\,\tilde y^{(j)}(k)\|^2\}$; this is needed to remove the scale ambiguity in the design of (5-4), and it doesn't affect the SINR. As noted in Sec. 5, the shift ambiguities in estimating $F_l^{(j)}$ do not have any influence on the equalizer design, hence on (6-2).
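A sample-average version of (6-2) can be sketched as follows. The expectations are replaced by sums over the record (an assumption about how the measure would be evaluated in practice), the least-squares scale is computed in closed form, and the function name `sinr` and the list-of-lists data layout are hypothetical.

```python
def sinr(y_true, y_est):
    """Sample SINR for one source; inputs are lists over time of per-sensor sample lists."""
    pairs = [(a, b) for yt, ye in zip(y_true, y_est) for a, b in zip(yt, ye)]
    signal = sum(abs(a) ** 2 for a, _ in pairs)
    # Least-squares scale: alpha = sum conj(b) a / sum |b|^2 minimizes sum |a - alpha b|^2.
    alpha = sum(b.conjugate() * a for a, b in pairs) / sum(abs(b) ** 2 for _, b in pairs)
    error = sum(abs(a - alpha * b) ** 2 for a, b in pairs)
    return signal / error
```

The returned value is a power ratio; taking `10 * math.log10(...)` of it gives the figure in dB. A perfect estimate makes the error term zero, so a caller would need to handle that case separately.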


The adaptive approach of Sec. 4.3 was applied without as well as with the reinitialization of Sec. 5.2. The average signal-to-noise ratio (SNR) per source was taken to be 27 dB, 20 dB, 14 dB and 7 dB, respectively, in four sets of 100 Monte Carlo runs, where the SNR for a given source $\mathcal{F}^{(j)}(z)w_j(k)$ is defined as

$$\mathrm{SNR} = \frac{\frac{1}{N}\,E\{\|\mathcal{F}^{(j)}(z)w_j(k)\|^2\}}{E\{|n_i(k)|^2\}}. \tag{6-3}$$

The initial guess for the tap gains was taken to be center-tap initialization: set $c_i(5) = 1$ for $i = m$ for the $m$-th stage equalizer ($m = 1, 2$) with the remaining tap gains set to zero. The algorithm step sizes and forgetting factors for each stage $m$ were chosen as: $\mu_1 = 0.0005$ in (4-29), $\mu_2 = 0.015$ in (4-32)-(4-34) and $\mu_3 = 0.0005$ in (4-39) and (4-41) when $\gamma_{4k}^{(m)} < 0$ (see (4-35)), and $\mu_1 = 0.00025$ in (4-29), $\mu_2 = 0.0075$ in (4-32)-(4-34) and $\mu_3 = 0.0005$ in (4-39) and (4-41) when $\gamma_{4k}^{(m)} > 0$. For the running cost (4-45) computation we selected $\mu_4 = 0.002$ in (4-46)-(4-48). The parameters $L_1$ and $L_2$ in (4-37) (see also (4-22) and (4-23)) were selected as $L_1 = 15$ and $L_2 = 6$. To design the MMSE equalizers/filters (5-13) or (5-15) we took $L_e = 11$, and $d$ was optimized following Remark 4 of Sec. 5.1.1 over the range $[-15, 6]$.


Fig. 1 shows the evolution of the average running cost (see (4-45)), averaged over 100 Monte Carlo runs (after ‘assigning’ each equalizer cost to its corresponding extracted source) without using any filter reinitialization. For the 4-QAM sources the 4th-order normalized cumulant equals $-1$; therefore, at convergence, the running cost (4-45) should be close to $-1$. In Fig. 1 we see these values to be smaller in magnitude than that, which is largely a consequence of noise in the data: the noise affects only the denominator of (4-45), making it larger than it should be. A similar effect is seen for the Gaussian mixture source whose 4th-order normalized cumulant equals 0.7433. Fig. 2 shows the evolution of the average running cost when the reinitialization (after 12000 samples) of Sec. 5.2 is used. It turns out that source 1 ($w_1(k)$: Gaussian mixture) is extracted first, so that reinitialization only affects source 2 (4-QAM).

Table I shows the average SINR (averaged over 100 Monte Carlo runs) for the two sources (as per (6-2)) at the end of the run (i.e., at $k = 18000$) without and with filter reinitialization, for various SNRs. The SINRs were computed using the solution (4-37) as well as the MMSE solution of Sec. 5.1.2. It is seen that blind signal separation benefits from both MMSE signal separation and filter reinitialization.


6.2  Example  2 


Consider a 2-input 3-output MA(6) system model resulting in $N = 3$ and $M = 2$ in (2-2). Its $3 \times 2$ transfer function $\mathcal{F}(z)$ was chosen as


0.2  -t-  0.82;-^  +  OAz-^  0.5  -  O.Zz  ^ 

0.3z-^  -  0.62-2  -O.2I2-1  _  0.52-2  ^  0.72Z-2  -t-  0.362-^  +  0.212"®  • 

0.  0. 


(6-4) 


Notice that the last row of (6-4) is identically zero, signifying that the third ‘sensor’ is not receiving any information signal, just noise. The first two rows of (6-4) are identical to (6-1). The inputs $\{w_j(k)\}$ ($j = 1, 2$) and additive noise are as in Example 1. The powers of $\{w_j(k)\}$ for $j = 1$ and 2 were scaled as in Example 1 to achieve equal average signal power at the sensors. The measurement SNRs are defined as in (6-3), and they were selected as 25.2 dB, 18.2 dB, 12.2 dB and 5.2 dB, respectively, in four sets of 100 Monte Carlo runs.


The adaptive approach of Sec. 4.3 was applied without as well as with the reinitialization of Sec. 5.2. The various parameters chosen for signal separation were exactly as for Example 1. Figs. 3 and 4 are the counterparts to Figs. 1 and 2, respectively, of Example 1, and Table II is the counterpart to Table I of Example 1. As in Example 1, it is seen that blind signal separation benefits from both MMSE signal separation and filter reinitialization. Comparing with the results of Example 1, it is seen that the results of Examples 1 and 2 are quite close to each other in spite of having a third “misleading” sensor in Example 2 that measures just noise.

7 Conclusions

The problem of blind separation of independent linear non-Gaussian signals (sources) from their linear convolutive mixtures was considered. In [23] an iterative, normalized higher-order cumulant maximization based approach was developed using the third-order and/or fourth-order normalized cumulants of the “beamformed” data. The approach is source-iterative, i.e., the sources are extracted (at each sensor) and cancelled one-by-one, providing a decomposition of the given data at each sensor into its independent signal components. In this paper we developed a stochastic gradient-based recursification of all of the batch optimization steps in [23].

Some further modifications and enhancements were also considered. For blind signal separation the estimated channel was used to decompose the received signal at each sensor into its independent signal components via an MMSE filter with a controlled delay. The proposed blind adaptive algorithm and its variations were illustrated via two simulation examples.

8  References 

[1] C. Jutten and J. Herault, “Blind separation of sources, Part I: An adaptive algorithm based on neuromorphic architecture,” Signal Processing, vol. 24, pp. 1-10, 1991.

[2]  J.L.  Lacoume  and  P.  Ruiz,  “Separation  of  independent  sources  from  correlated  inputs,”  IEEE  Trans. 
Signal  Processing,  vol.  SP-40,  pp.  3074-3078,  Dec.  1992. 

[3]  E.  Moreau  and  O.  Macchi,  “New  self-adaptive  algorithms  for  source  separation  based  on  contrast 
functions,”  in  Proc.  IEEE  Signal  Proc.  Workshop  on  Higher-Order  Statistics,  South  Lake  Tahoe,  CA, 
pp.  215-219,  June  1993. 

[4] L. Tong, Y. Inouye and R. Liu, “Waveform-preserving blind estimation of multiple independent sources,” IEEE Trans. Signal Processing, vol. SP-41, pp. 2461-2470, July 1993.

[5] J.F. Cardoso and A. Souloumiac, “Blind beamforming for non-Gaussian signals,” IEE Proc.-F, Radar and Signal Processing, vol. 140, pp. 362-370, Dec. 1993.

[6]  P.  Comon,  “Independent  component  analysis,  a  new  concept?,”  Signal  Processing,  vol.  36,  No.  3,  pp. 
287-314,  1994. 

[7] D. Yellin and E. Weinstein, “Criteria for multichannel signal separation,” IEEE Trans. Signal Processing, vol. SP-42, pp. 2158-2168, Aug. 1994.




[8]  A.  Mansour  and  C.  Jutten,  “Fourth-order  criteria  for  blind  source  separation,”  IEEE  Trans.  Signal 
Processing,  vol.  SP-43,  pp.  2022-2025,  Aug.  1995. 

[9]  A.  Swami,  G.B.  Giannakis  and  S.  Shamsunder,  “Multichannel  ARMA  processes,”  IEEE  Trans.  Signal 
Proc.,  vol.  SP-42,  pp.  898-914,  April  1994. 

[10]  L.  Tong,  Y.  Inouye  and  R.  Liu,  “A  finite-step  global  convergence  algorithm  for  parameter  estimation  of 
multichannel  MA  processes,”  IEEE  Trans.  Signal  Processing,  vol.  SP-40,  pp.  2547-2558,  Oct.  1992. 

[11]  Y.  Inouye  and  K.  Hirano,  “Cumulant-based  blind  identification  of  linear  multi-input-multi-output 
systems  driven  by  colored  inputs,”  IEEE  Trans.  Signal  Proc.,  vol.  SP-45,  pp.  1543-1552,  June  1997. 

[12]  X.-R.  Gao  and  R.-W.  Liu,  “General  approach  to  blind  source  separation,”  IEEE  Trans.  Signal  Proc., 
vol.  SP-44,  pp.  562-571,  March  1996. 

[13]  J.K.  Tugnait,  “Identification  and  deconvolution  of  multichannel  linear  non-Gaussian  processes  using 
higher-order  statistics  and  inverse  filter  criteria,”  IEEE  Trans.  Signal  Processing,  vol.  SP-45,  pp.  658- 
672,  March  1997. 

[14]  J.R.  Treichler  and  M.G.  Larimore,  “New  processing  techniques  based  on  the  constant  modulus  adaptive 
algorithm,”  IEEE  Trans.  Acoustics,  Speech,  Signal  Processing,  vol.  ASSP-33,  pp.  420-431,  April  1985. 

[15] M. Rosenblatt, Stationary Sequences and Random Fields. Birkhäuser: Boston, 1985.

[16]  J.K.  Tugnait,  “Blind  spatio-temporal  equalization  and  impulse  response  estimation  for  MIMO  channels 
using  a  Godard  cost  function,”  IEEE  Transactions  on  Signal  Processing,  vol.  SP-45,  pp.  268-271,  Jan. 
1997. 

[17]  P.  Loubaton  and  P.  Regalia,  “Blind  deconvolution  of  multivariate  signals:  a  deflation  approach,”  in 
Proc.  Intern.  Conf.  Commun.,  pp.  1160-1164,  Geneva,  Switzerland,  June  1993. 

[18]  J.K.  Tugnait,  “Parameter  identifiability  of  multichannel  ARMA  models  of  linear  non-Gaussian  signals 
via  cumulant  matching,”  IEEE  Transactions  on  Signal  Processing,  vol.  SP-43,  pp.  3067-3069,  Dec. 
1995. 

[19]  Y.  Inouye  and  B.  Sako,  “Identifiability  of  multichannel  linear  systems  driven  by  colored  inputs  and  its 
application  to  blind  signal  separation,”  in  Proc.  ISCAS-94,  pp.  57-60,  vol.  5,  1994. 

[20]  Y.  Inouye  and  K.  Hirano,  “Blind  identification  of  linear  multi-input-multi-output  systems  driven  by 
colored  inputs  with  applications  to  blind  signal  separation,”  in  Proc.  34th  IEEE  Conf.  Decision  & 
Control,  pp.  715-720,  New  Orleans,  LA,  Dec.  1995. 

[21]  P.  Loubaton  and  P.  Regalia,  “Blind  deconvolution  of  multivariate  signals  using  adaptive  FIR  lossless 
filters,”  in  Proc.  EUSIPCO  92,  pp.  1061-1064,  Brussels,  Belgium,  Aug.  1992. 




[22]  J.K.  Tugnait,  “Spatio-temporal  signal  processing  for  blind  separation  of  multichannel  signals,”  in 
Digital  Signal  Processing  Technology,  Joseph  Picone,  Editor,  Proc.  SPIE  2750,  pp.  88-103,  1996. 
[Proceedings  of  the  SPIE  Conf.  held  in  Orlando,  FL,  April  10-11,  1996.] 

[23]  J.K.  Tugnait,  “On  blind  separation  of  convolutive  mixtures  of  independent  linear  signals,”  in  Proc. 
Eighth  IEEE  Signal  Processing  Workshop  on  Statistical  Signal  and  Array  Processing,  pp.  312-315, 
Corfu,  Greece,  June  24-26,  1996. 

[24]  E.  Weinstein,  A.V.  Oppenheim,  M.  Feder  and  J.R.  Buck,  “Iterative  and  sequential  algorithms  for 
multisensor  signal  enhancement,”  IEEE  Trans.  Signal  Proc.,  vol.  SP-42,  pp.  846-859,  April  1994. 

[25] U. Lindgren, T. Wigren and H. Broman, “On local convergence of a class of blind separation
algorithms,” IEEE Trans. Signal Proc., vol. SP-43, pp. 3054-3058, Dec. 1995.

[26] D.H. Brandwood, “A complex gradient operator and its application in adaptive array theory,” Proc.
IEE, vol. 130, pts. F & H, pp. 11-16, Feb. 1983.

[27]  H.V.  Poor,  An  Introduction  to  Signal  Detection  and  Estimation.  Springer-Verlag:  New  York,  1988. 

[28]  S.U.H.  Qureshi,  “Adjustment  of  the  position  of  the  reference  tap  of  an  adaptive  equalizer,”  IEEE 
Trans.  Commun.,  vol.  COM-21,  pp.  1046-1052,  Sept.  1973. 

[29]  O.  Macchi  and  E.  Moreau,  “Self-adaptive  source  separation,  part  I:  convergence  analysis  of  a  direct 
linear  network  controlled  by  Herault-Jutten  algorithm,”  IEEE  Trans.  Signal  Proc.,  vol.  SP-45,  pp. 
918-926,  April  1997. 

[30]  N.  Delfosse  and  P.  Loubaton,  “Adaptive  blind  separation  of  convolutive  mixtures,”  in  Proc.  1996 
ICASSP,  pp.  2940-2943,  Atlanta,  GA,  May  7-10,  1996. 

[31]  H.L.  Nguyen  and  C.  Jutten,  “Blind  source  separation  for  convolutive  mixtures,”  Signal  Processing, 
vol.  45,  pp.  209-229,  1995. 

[32]  C.  Serviere,  “Blind  source  separation  of  convolutive  mixtures,”  in  Proc.  Eighth  IEEE  Signal  Processing 
Workshop  on  Statistical  Signal  and  Array  Processing,  pp.  316-319,  Corfu,  Greece,  June  24-26,  1996. 

[33]  J.F.  Cardoso  and  B.H.  Laheld,  “Equivariant  adaptive  source  separation,”  IEEE  Trans.  Signal  Proc., 
vol.  SP-44,  pp.  3017-3030,  Dec.  1996. 

[34]  S.  Haykin,  Adaptive  Filter  Theory,  3rd  Ed.  Prentice-Hall:  Upper  Saddle  River,  NJ,  1996. 

[35]  A.  Mathur  et  al.,  “Convergence  properties  of  the  multistage  constant  modulus  array  for  correlated 
sources,”  IEEE  Trans.  Signal  Proc.,  vol.  SP-45,  pp.  280-286,  Jan.  1997. 

[36]  L.  Tong  and  H.H.  Zeng,  “Channel  surfing  reinitialization  for  the  constant  modulus  algorithm,”  IEEE 
Signal  Proc.  Letters,  vol.  SPL-4,  pp.  85-87,  March  1997. 




Table 1: Example 1: Average SINR after blind separation with record length = 18000 samples.
Serial: Algorithm of Sec. 4.3; Parallel: Algorithm of Sec. 4.3 coupled with reinitialization of Sec. 5.2.

          SOURCE 1 (Gaussian mixture)            SOURCE 2 (4-QAM)
          serial            parallel             serial            parallel
SNR       (4-37)   MMSE     (4-37)   MMSE        (4-37)   MMSE     (4-37)   MMSE
27 dB     8.656    10.785   8.656    10.785      11.620   12.815   16.618   15.689
20 dB     8.454    10.428   8.454    10.428      11.203   12.280   15.333   14.647
14 dB     7.828    9.346    7.828    9.346       9.886    10.695   12.576   12.256
7 dB      5.957    6.594    5.957    6.594       6.548    6.932    7.807    7.718

Table 2: Example 2: Average SINR after blind separation with record length = 18000 samples.
Serial: Algorithm of Sec. 4.3; Parallel: Algorithm of Sec. 4.3 coupled with reinitialization of Sec. 5.2.

          SOURCE 1 (Gaussian mixture)            SOURCE 2 (4-QAM)
          serial            parallel             serial            parallel
SNR       (4-37)   MMSE     (4-37)   MMSE        (4-37)   MMSE     (4-37)   MMSE
25.2 dB   8.653    10.667   8.653    10.667      11.621   12.647   16.123   15.271
18.2 dB   8.447    10.317   8.447    10.317      11.198   12.134   15.078   14.351
12.2 dB   7.807    9.253    7.807    9.253       9.876    10.591   12.445   12.070
5.2 dB    5.893    6.511    5.893    6.511       6.505    6.862    7.746    7.626
Fig. 2. Example 1 (Algorithm of Sec. 4.3 with reinitialization of Sec. 5.2).
Abstract

This paper is concerned with the problem of blind adaptive deconvolution of multiple
communications signals and estimation of the matrix impulse response function of the underlying
multiple-input multiple-output system given only the measurements of the vector output of the
system. The multiple signals are received at an antenna array in the presence of both interuser
and intersymbol interference. Recently a source-iterative, inverse filter criteria based approach
was developed using the fourth-order normalized cumulants of the inverse filtered data at
zero-lag. The approach was input-iterative, i.e., the inputs were extracted and removed one-by-one.
The matrix impulse response was then obtained by cross-correlating the extracted inputs with the
observed outputs. In this paper an adaptive implementation of the above approach is developed
using a stochastic gradient approach. Computer simulation examples are presented to illustrate
the proposed approach.
1 Introduction


Multiuser wireless communications systems have attracted considerable attention in recent years.
Because of the limited frequency spectrum allocated, approaches that lead to increased spectrum
efficiency are of much interest. One promising concept is to use antenna arrays to discriminate among
signals that have distinct spatial signatures (multipaths etc.) - SDMA (space division multiple
access) [9]-[11]. This allows several sources using the same carrier frequency to also use the same time
slot in a given cell, thereby increasing the system capacity. In this paper we consider the problem of
separating multiple signals (including possibly non-digital communications interferences) received
at an antenna array. The signals are allowed to undergo multipath propagation where the delay
spreads are not necessarily negligible.

The baseband-equivalent mathematical model for the problem under consideration is that of a
multiple-input multiple-output (MIMO) system. Such modeling of digital communication systems
has received considerable attention recently in a variety of contexts (other than SDMA) [13]-[16].
A major limiting factor in high data rate (> 800 kbps) digital subscriber lines (DSL) using
twisted-pair wires is crosstalk between twisted pairs in close physical proximity [14]. In [14] the entire
cable has been treated as a single MIMO channel with the crosstalk characterized by the matrix
impulse response of the channel, rather than as additive noise. Reference [14] is concerned with the design
of linear equalizers for suppression of near- and far-end crosstalk assuming complete knowledge
of the MIMO channel matrix transfer function. In multi-track digital magnetic recording [17] a
MIMO representation is needed to represent crosstalk arising from adjacent tracks. Even in a
single-track situation, MIMO models may arise because of vector stationary process modeling of
scalar cyclostationary signals [13]. For instance, many run-length limited codes used in magnetic
recording give rise to cyclostationary sequences [13]. Other applications include dually polarized
radio channels [18] and multisensor sonar/radar systems [19].

Some of the recent work on MIMO channels has been concerned with design of transmitter and
receiver filters [14] and [15], and MIMO equalizers for suppression of intersymbol interference (ISI),
cochannel and adjacent channel interferences (CCI and ACI) [13] and [16]. In these contributions
complete knowledge of the MIMO transfer function is assumed to be available. As noted in [16]
(see also [17]), the equalizers can be adapted using LMS (least-mean squares) or other algorithms
based on minimizing the mean square error between the actual response (of the equalizer) and the
desired response which is typically supplied by a training sequence. In case of MIMO channels,
the training sequence has to be a vector sequence which, in turn, implies that cochannel and
adjacent channel interferences must also cooperate in furnishing training sequences. Clearly this
is unrealistic. In other situations even the transmitter of the desired signal may not be able to
transmit a training sequence. This leads to the desirability of adaptive design of MIMO equalizers
and channel estimators in the absence of any training sequences: blind channel estimation and
equalization. This paper is concerned with exactly this problem.

Past work on blind equalization and/or channel estimation has been overwhelmingly concentrated
upon SISO systems (single signal single channel scenario with baud-rate sampled data).
The first blind adaptive equalizer was proposed by Sato [20]. This work was followed by generalizations
due to Godard [21] and to Benveniste et al. [22]. The CMA (constant modulus algorithm)
[23] is a special case of and an alternative interpretation of the Godard family of equalizers. Other
contributions to the SISO systems blind equalization problem include [25]-[26] (and references therein).
The communications channels are, in general, nonminimum-phase; hence, the second-order statistics
of the baud-rate-sampled stationary signals are inadequate for blind channel identification
[22], [24], [27]; that is, in general, the conventional LMS (least-mean square) scheme will not work in
a blind setting [27]. In [1]-[2] (and others [33]) it is proposed to use fractional sampling and to
exploit the second-order statistics of the fractionally sampled data, which are cyclostationary. It has
been shown in [28] that for a class of multipath channels, the approaches of [1]-[2] will be unable
to correctly identify the underlying channel transfer function. In particular, this class includes all
multipath channels consisting of time delays that are integer multiples of the symbol duration. No
such problems arise if higher-order statistics of the data are also exploited [28].

Prior work on blind equalization and/or channel estimation for truly MIMO systems (more
than one information sequence) has been far less extensive. References [3]-[7], [9]-[12], [23], [29]
and [30]-[33] (and references therein) have considered this problem in the communications context.
In an interesting paper [12] it has been pointed out that given a complex(-valued) MIMO channel
and consequently a complex equalizer, but with real-valued signals (sources such as M-ary PAM),
the CMA/Godard cost functions (and their variations) employed in [7], [10], [23], [29], [30] and [32]
will have some undesirable global minima in that the real and imaginary parts of each equalizer
output, after convergence, may correspond to different user signals. This would then necessitate
further processing to check for and correct such misconvergence. In a mixed source scenario where
not all users have the same alphabet (e.g. 2-PAM and 4-QAM), it is not clear how one would detect
such a misconvergence. We note that such a misconvergence cannot occur for the cost function
considered in this paper (see [5] for a discussion of the convergence points of the cost used). The
approaches of [7], [10], [23], [30] and [32] are restricted to symmetric sources with negative fourth
cumulants. In this paper we allow the sources to be asymmetric and they can have negative or
positive fourth cumulants.

The finite alphabet property (all sources have the same alphabet) has been used in [9] where
delay spread has been assumed to be negligible. We assume neither in this paper: different users
are allowed to have different alphabets and moreover, some “users” may actually represent
non-Gaussian non-communications interferences. Moreover, we consider multipath propagation with
non-negligible delay spreads. In [3] a subspace approach has been used whereas in [31] a
subspace approach coupled with the finite alphabet property (with known and identical alphabets)
has been proposed. The subspace approaches of [3] and [31] require the MIMO transfer function
$\mathcal{F}(z)$ (see (2-2)) to have full rank for every $z$ including $z = \infty$ but excluding $z = 0$, whereas in this
paper we only require $\mathcal{F}(z)$ to have full rank for $|z| = 1$; see Theorem 1 in Sec. 3.

In [4],[5] an iterative, inverse filter criteria based approach has been developed for deconvolution
of multichannel non-Gaussian processes using the fourth-order normalized cumulants of the inverse
filtered data at zero-lag. The approach is input-iterative, i.e., the inputs are extracted and removed
one-by-one. The matrix impulse response is then obtained by cross-correlating the extracted inputs
with the observed outputs. In this paper we develop a stochastic gradient-based “recursification” of
all of the batch optimization steps in [4],[5]. An interesting input-iterative adaptive approach using
prewhitened observations and the fourth-order cumulant of the inverse-filtered data at zero-lag has
been considered in [34] and [35]. The inverse filter is constrained to have a lossless filter structure
which is realized using a lossless lattice filter. Such a restriction can lead to ill-conditioning of the
algorithm of [34] as one iteratively extracts input sequences. A fix to this is proposed in [35] but it
works only for the two-input case. Refs. [34] and [35] are restricted to ‘square’ systems: number of
inputs ($M$) equal to the number of outputs ($N$), whereas in this paper we allow $N > M$, a common
occurrence in array processing. Moreover, in this paper we perform no prewhitening; rather we
operate directly on the given measurements. A consequence of this is that the ill-conditioning of
[34],[35] referred to above does not occur in our approach. Refs. [34] and [35] are restricted to
real-valued data whereas we also consider complex-valued observations.

The paper is organized as follows. In Sec. 2 the precise model assumptions are presented. The
inverse-filter criteria-based approach of [5], the underlying identifiability results and the iterative
source separation solution of [5] are briefly discussed in Sec. 3. In Sec. 4 we develop a stochastic
gradient-based “recursification” of all of the batch optimization steps discussed in Sec. 3. Computer
simulation examples are presented in Sec. 5.

2  Problem  Statement  and  Assumptions 

Consider a discrete-time MIMO system, possibly complex-valued, with $N$ outputs and $M$ inputs.
The $i$-th component of the output at time $k$ is given by

$$ y_i(k) = \sum_{j=1}^{M} \mathcal{F}_{ij}(z)\, w_j(k) + n_i(k), \quad i = 1, 2, \ldots, N, \tag{2-1} $$

$$ \mathbf{y}(k) = \mathcal{F}(z)\,\mathbf{w}(k) + \mathbf{n}(k), \tag{2-2} $$

where $\mathbf{y}(k) = [\, y_1(k) \;\; y_2(k) \;\; \cdots \;\; y_N(k) \,]^T$, similarly for $\mathbf{w}(k)$ and $\mathbf{n}(k)$, $z$ denotes both the
backward-shift operator (i.e., $z^{-1}\mathbf{w}(k) = \mathbf{w}(k-1)$, etc.) as well as the complex variable $z$ in the
$\mathcal{Z}$-transform, $w_j(k)$ is the $j$-th input at sampling time $k$, $y_i(k)$ is the $i$-th output, $n_i(k)$ is the
additive Gaussian measurement noise independent of $\{\mathbf{w}(k)\}$, and

$$ \mathcal{F}_{ij}(z) := \sum_{l=-\infty}^{\infty} f_{ij}(l)\, z^{-l} $$

is the scalar transfer function with $w_j(k)$ as the input and $y_i(k)$ as the output. The MIMO transfer
function is $\mathcal{F}(z)$ with $ij$-th element $\mathcal{F}_{ij}(z)$. The model (2-1)-(2-2) is the space-time
baseband-equivalent channel model used by several authors (e.g. [3]-[7], [10]-[12] and references therein). The
above model could be the result of baud-rate sampling of continuous-time signals at $N$ sensors, or
it could be the result of oversampling (fractional sampling) at fewer than $N$ sensors [1]-[3].
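
As an illustration of the model (2-1), the following minimal sketch generates the outputs of a causal FIR MIMO channel. It is not taken from the paper: the function name `mimo_output`, the nested-list layout `f[i][j][l]` for the taps, the restriction to causal taps (l = 0, 1, ...), and the zero initial conditions are all assumptions made only for this example.

```python
import random

def mimo_output(f, w, noise_std=0.0, seed=0):
    """Hypothetical helper: y_i(k) = sum_j sum_l f[i][j][l] * w[j][k - l] + n_i(k).

    f[i][j] is the (causal, FIR) impulse response from input j to output i.
    w[j] is the j-th input sequence; samples before time 0 are taken as zero.
    noise_std is the per-component standard deviation of complex Gaussian noise.
    """
    rng = random.Random(seed)
    n_out, n_in, n_samp = len(f), len(f[0]), len(w[0])
    y = [[0j] * n_samp for _ in range(n_out)]
    for i in range(n_out):
        for k in range(n_samp):
            acc = 0j
            for j in range(n_in):
                for l, tap in enumerate(f[i][j]):
                    if k - l >= 0:
                        acc += tap * w[j][k - l]
            if noise_std:
                acc += complex(rng.gauss(0, noise_std), rng.gauss(0, noise_std))
            y[i][k] = acc
    return y
```

For instance, with N = 2 outputs and M = 2 inputs, `mimo_output([[[1, 0.5], [0]], [[0], [1j]]], [[1, -1, 1], [1, 1, -1]])` filters the first input through a two-tap channel to the first sensor and passes the second input, rotated by j, to the second sensor.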

The following assumptions are made concerning the system model:

(AS1) The vector sequence $\{\mathbf{w}(k)\}$ is zero-mean, temporally i.i.d. (independent and identically
distributed) and spatially independent, i.e., various components of $\mathbf{w}(k)$ are independent
of each other but not necessarily identically distributed. Assume that the fourth-order
cumulants (see (3-1) later) of all the components of $\mathbf{w}(k)$ are nonzero but not necessarily
negative.

(AS2) If it is an infinite impulse response (IIR) model, then (2-2) is assumed to be the result
of a finite-dimensional multichannel ARMA model such that the model matrix impulse
response function is exponentially stable, i.e., $\|[f_{ij}(l)]\| < \alpha \beta^{|l|}$ for some $0 < \alpha < \infty$
and $0 < \beta < 1$, where $[f_{ij}(l)]$ denotes a matrix with its $ij$-th element as $f_{ij}(l)$.

(AS3) $N \ge M$, i.e., at least as many outputs as inputs.

(AS4) $\mathrm{Rank}\{\mathcal{F}(z)\} = M$ for any $|z| = 1$.

Notice that we allow the fourth-order cumulants of some components of $\mathbf{w}(k)$ to be positive. This
implies that not all of the signals impinging upon the array are necessarily digital communications
signals. Moreover, we do not require $E\{w_j^2(k)\} = 0$ if the component $w_j(k)$ has negative fourth
cumulant; this is in contrast to the CMA/Godard algorithm-based approaches where we also must
have $E\{w_j^2(k)\} = 0$ in addition to negative fourth cumulant of $w_j(k)$. The objective is to recover
$w_j(k)$ $\forall j$. For non-communications signals (interferences), one may be interested in analyzing the
sources (direction, for instance) of such interference. One may not know in advance the number of
such interfering sources; consequently, the existing methods (such as [9] and [31]) that exploit the
finite alphabet property of the digital communications signals to simultaneously extract all of the
sources will not work for the stated problem.

As noted earlier in Sec. 1, it has been pointed out in [12] that for complex MIMO
channel-equalizer cascades, but with real-valued sources, the CMA/Godard costs will have some undesirable
global minima. “The real and imaginary parts of each equalizer output after convergence, may
correspond to different user signals” [12]. It has been shown in [12] that the reason for this is that
such real-valued signals are asymmetric (i.e. $E\{w_j^2(k)\} \ne 0$). Such a misconvergence cannot occur
for the cost function (3-3) considered in this paper [5].

3  An  Iterative  Solution  Based  on  Inverse-Filter  Criteria 

In [4],[5] an iterative, inverse filter criteria based approach has been developed for deconvolution of
multichannel non-Gaussian processes using the fourth-order normalized cumulants of the inverse
filtered data at zero-lag. The approach is input-iterative, i.e., the inputs are extracted and removed
one-by-one. The matrix impulse response is then obtained by cross-correlating the extracted inputs
with the observed outputs. In this paper we develop a stochastic gradient-based “recursification”
of all of the batch optimization steps in [4],[5]. In this section we briefly discuss the batch
(non-recursive) approach of [4],[5]; its adaptive version is developed in Sec. 4.

Let $\mathrm{CUM}_4(w)$ denote the fourth-order cumulant of a complex-valued scalar zero-mean random
variable $w$, defined as

$$ \mathrm{CUM}_4(w) := \mathrm{cum}_4\{w, w^*, w, w^*\} = E\{|w|^4\} - 2\,[E\{|w|^2\}]^2 - |E\{w^2\}|^2 \tag{3-1} $$

where $*$ denotes complex conjugation. We will use the notation $\gamma_{4w_i} = \mathrm{CUM}_4(w_i(k))$ and
$\sigma_{w_i}^2 = E\{|w_i(k)|^2\}$. Consider a $1 \times N$ row-vector polynomial equalizer (filter) $\mathbf{C}^T(z)$, with its $j$-th entry
denoted by $C_j(z)$, operating on the data vector $\mathbf{y}(k)$. Let the equalizer output be denoted by $e(k)$:

$$ e(k) = \sum_{i=1}^{N} C_i(z)\, y_i(k). \tag{3-2} $$

Following [4] consider maximization of the cost

$$ J := \frac{|\mathrm{CUM}_4(e(k))|}{[E\{|e(k)|^2\}]^2} \tag{3-3} $$

for designing a linear equalizer to recover one of the inputs. It is shown in [4] that when (3-3) is
maximized w.r.t. $\mathbf{C}(z)$, then (3-2) reduces to

$$ e(k) = d\, w_{j_0}(k - k_0), \tag{3-4} $$

where $d$ is some complex constant, $k_0$ is some integer, and $j_0$ indexes some input out of the given $M$
inputs, i.e., the equalizer output is a possibly scaled and shifted version of one of the system inputs.
It has been established in [5] that under (AS1)-(AS4) and no noise, such a solution exists and if
doubly-infinite equalizers are used, then all locally stable stationary points of the given cost w.r.t.
the equalizer coefficients are also characterized by solutions such as (3-4).
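
To make the definitions (3-1) and (3-3) concrete, here is a small sketch that replaces the expectations by sample averages. The function names `cum4` and `normalized_cost` are hypothetical and chosen only for this illustration.

```python
def cum4(x):
    """Sample estimate of CUM4 in (3-1) for a zero-mean (possibly complex) sequence x."""
    n = len(x)
    m4 = sum(abs(v) ** 4 for v in x) / n   # E{|w|^4}
    m2 = sum(abs(v) ** 2 for v in x) / n   # E{|w|^2}
    m2t = sum(v * v for v in x) / n        # E{w^2} (no conjugate)
    return m4 - 2 * m2 ** 2 - abs(m2t) ** 2

def normalized_cost(x):
    """Sample estimate of the cost (3-3): |CUM4(e)| / (E{|e|^2})^2."""
    m2 = sum(abs(v) ** 2 for v in x) / len(x)
    return abs(cum4(x)) / m2 ** 2
```

Evaluated over one full period of an equiprobable alphabet, this gives a fourth cumulant of -2 for BPSK (±1) and -1 for unit-power 4-QAM, both negative as expected for such constellations. An equal-weight sum of two independent BPSK sequences has normalized cost 1, below the value 2 of a single BPSK source, which is consistent with the cost being maximized when a single input is isolated. The cost is also unchanged when the sequence is multiplied by any nonzero complex constant.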

A source-iterative solution is given by:

Step 1. Maximize (3-3) w.r.t. the equalizer $\mathbf{C}(z)$ to obtain (3-4). Let

$$ \gamma_{4e} = \mathrm{CUM}_4(e(k)) = \mathrm{CUM}_4(d\, w_{j_0}(k)). $$

Step 2. Cross-correlate $\{e(k)\}$ (of (3-4)) with the given data (2-2) and define a possibly scaled
and shifted estimate of $f_{ij_0}(\tau)$ as

$$ \hat{f}_{ij_0}(\tau) := \frac{E\{y_i(k)\, e^*(k-\tau)\}}{E\{|e(k)|^2\}} \tag{3-6} $$

where $\mathcal{F}_{ij}(z) = \sum_{l=-\infty}^{\infty} f_{ij}(l)\, z^{-l}$. Consider now the reconstructed contribution of $e(k)$
to the data $y_i(k)$ ($i = 1, 2, \ldots, N$), denoted by $\hat{y}_{i,j_0}(k)$:

$$ \hat{y}_{i,j_0}(k) := \sum_{l} \hat{f}_{ij_0}(l)\, e(k-l). \tag{3-7} $$

Step 3. Remove the above contribution from the data to define the outputs of a MIMO system
with $N$ outputs and $M-1$ inputs. These are given by

$$ y_i'(k) := y_i(k) - \hat{y}_{i,j_0}(k). \tag{3-8} $$

Step 4. If $M > 1$, set $M \leftarrow M - 1$, $y_i(k) \leftarrow y_i'(k)$, and go back to Step 1, else quit.

In practice, all the expectations in (3-6) are replaced with their sample averages over appropriate
data records.
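
Steps 2 and 3 can be sketched with sample averages as follows. This is only an illustration under stated assumptions: a single sensor, two real BPSK inputs, hypothetical two-tap channels `f1` and `f2`, and an idealized equalizer output that already equals d·w1(k − k0) (that is, Step 1 is assumed to have converged). The names `xcorr_taps`, `cancel` and `demo` are not from the paper.

```python
import random

def xcorr_taps(y, e, lags):
    """Sample version of (3-6): f_hat(tau) = <y(k) e*(k - tau)> / <|e(k)|^2>."""
    n = len(y)
    power = sum(abs(v) ** 2 for v in e) / n
    taps = {}
    for tau in lags:
        acc = sum(y[k] * e[k - tau].conjugate()
                  for k in range(n) if 0 <= k - tau < n)
        taps[tau] = acc / n / power
    return taps

def cancel(y, e, taps):
    """Sample version of (3-7)-(3-8): subtract the reconstructed contribution of e."""
    n = len(y)
    return [y[k] - sum(t * e[k - tau] for tau, t in taps.items()
                       if 0 <= k - tau < n)
            for k in range(n)]

def demo(n=4000, seed=1):
    """Toy check: one sensor, two BPSK inputs, hypothetical channels f1 and f2."""
    rng = random.Random(seed)
    w1 = [rng.choice((-1.0, 1.0)) for _ in range(n)]
    w2 = [rng.choice((-1.0, 1.0)) for _ in range(n)]
    f1, f2 = [1.0, 0.5j], [0.8, -0.3]

    def conv(f, w, k):
        return sum(f[l] * w[k - l] for l in range(len(f)) if k - l >= 0)

    y = [conv(f1, w1, k) + conv(f2, w2, k) for k in range(n)]
    d, k0 = 2j, 1                      # assumed scale and delay of the extracted input
    e = [d * w1[k - k0] if k >= k0 else 0j for k in range(n)]
    taps = xcorr_taps(y, e, range(-2, 3))
    resid = cancel(y, e, taps)

    def power(x):
        return sum(abs(v) ** 2 for v in x) / len(x)

    return taps, power(y), power(resid)
```

In this toy setup the estimated taps approach f1(τ + k0)/d, so the unknown scale and delay cancel when the contribution is reconstructed, and the residual power drops from about 1.98 to about 0.73, the power contributed by the second input alone.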


It has been shown in [4],[5] that

$$ \hat{y}_{i,j_0}(k) = \sum_{l} f_{ij_0}(l)\, w_{j_0}(k-l), \tag{3-9} $$

i.e., we have decomposed the observations at the various sensors into their independent components:
$\hat{y}_{i,j_0}(k)$ in (3-9) represents the contribution of $\{w_{j_0}(k)\}$ to the $i$-th sensor, achieving blind signal
separation. This aspect may be useful in isolation and analysis of non-communication interfering
signals.
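
The identity (3-9) can be checked by a short calculation. The sketch below assumes noise-free data, an equalizer output exactly of the form $e(k) = d\,w_{j_0}(k-k_0)$ as in (3-4), and writes $\hat{f}_{ij_0}(\tau)$ for the cross-correlation estimate (3-6) and $\sigma_{w_{j_0}}^2 = E\{|w_{j_0}(k)|^2\}$; it uses only the independence and i.i.d. properties in (AS1).

```latex
\begin{align*}
\hat{f}_{ij_0}(\tau)
  &= \frac{E\{y_i(k)\,e^*(k-\tau)\}}{E\{|e(k)|^2\}}
   = \frac{\sum_{j}\sum_{l} f_{ij}(l)\,
           E\{w_j(k-l)\,d^*\,w_{j_0}^*(k-\tau-k_0)\}}{|d|^2\,\sigma_{w_{j_0}}^2}
   = \frac{f_{ij_0}(\tau+k_0)}{d}, \\
\hat{y}_{i,j_0}(k)
  &= \sum_{\tau} \hat{f}_{ij_0}(\tau)\,e(k-\tau)
   = \sum_{\tau} f_{ij_0}(\tau+k_0)\,w_{j_0}(k-\tau-k_0)
   = \sum_{l} f_{ij_0}(l)\,w_{j_0}(k-l).
\end{align*}
```

The unknown scale $d$ and delay $k_0$ appear in the channel estimate but cancel in the reconstructed contribution, which is why the cancellation step does not need to resolve them.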

Remark 1. It has been shown in [5] that under the conditions (AS1)-(AS4) and no noise, the
proposed iterative approach is capable of blind identification of a MIMO transfer function $\mathcal{F}(z)$
up to a time-shift, a scaling and a permutation matrix provided that we allow doubly-infinite
equalizers. That is, given $\mathcal{F}(z)$, we end up with an $\mathcal{A}(z)$ where the two are related via

$$ \mathcal{A}(z) = \mathcal{F}(z)\, D\, \Lambda\, P \tag{3-10} $$

where $D$ is an $M \times M$ “time-shift” diagonal matrix with diagonal entries such as $z^{-k_0}$ (recall (3-4)),
$\Lambda$ is an $M \times M$ diagonal scaling matrix, and $P$ is an $M \times M$ permutation matrix. The following
result has been proved in [5].

Theorem 1 [5]: Given the model (2-2) such that $\mathbf{n}(k) = 0$ and given the true 4th-order and
2nd-order cumulant functions of the model output $\{\mathbf{y}(k)\}$ such that conditions (AS1)-(AS4) hold true.
Suppose that doubly infinite equalizers are used in Steps 1-4 of the iterative procedure of Sec. 3.
Then this procedure yields a transfer function $\mathcal{A}(z)$ satisfying (3-10). □

Remark 2. The results of [4],[5] are based upon the use of doubly-infinite inverse filters. If we
assume that $\mathcal{F}(z)$ has finite impulse response (FIR) and $\mathrm{rank}\{\mathcal{F}(z)\} = M$ for any $z$ (including
$z = \infty$ but excluding $z = 0$), then finite length inverse filters suffice. For an analysis and further
elaborations, see [6] and [7] where a Godard cost function is considered, but the results of [6] and
[7] can be easily modified to apply to the cost (3-3) and the basic conclusions remain unchanged.
The following result follows from [5] and [7].

Theorem 2: Given the FIR model (2-2) such that $\mathbf{n}(k) = 0$ and conditions (AS1) and (AS4)
hold true. Suppose that Steps 1-4 of the iterative procedure of Sec. 3 are used and the record length
tends to infinity. Then this procedure yields a transfer function $\mathcal{A}(z)$ satisfying (3-10) if one of the
following holds true:

(A) $\mathrm{Rank}\{\mathcal{F}(z)\} = M$ for any $z$ (including $z = \infty$ but excluding $z = 0$), and doubly-infinite
equalizers are used.

(B) $\mathrm{Rank}\{\mathcal{F}(z)\} = M$ for any $z$ (including $z = \infty$ but excluding $z = 0$), $\mathcal{F}(z)$ is column-reduced
and FIR equalizers with length $> (2M-1)L_c - 1$ are used, where $L_c$ = channel length.

□

4  Adaptive  Algorithm 

In this section we develop a stochastic gradient-based “recursification” of all of the batch
optimization steps discussed in Sec. 3. Theorems 1 and 2 of Sec. 3 motivate and justify the algorithm
developed in this section.


4.1  First  Stage  Maximization  of  Normalized  Fourth  Cumulant 

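Let the length of the equalizer $\mathbf{C}(z)$ be $L_e$ and let

$$ C_i(z) = \sum_{l=0}^{L_e-1} c_i(l)\, z^{-l}. \tag{4-1} $$

This allows us to rewrite (3-2) as

$$ e(k) = \sum_{i=1}^{N} \sum_{l=0}^{L_e-1} c_i(l)\, y_i(k-l) = C^T Y(k) \tag{4-2} $$

where

$$ Y(k) = [\, Y_1^T(k) \;\; Y_2^T(k) \;\; \cdots \;\; Y_N^T(k) \,]^T, \tag{4-3} $$

$$ Y_i(k) = [\, y_i(k) \;\; y_i(k-1) \;\; \cdots \;\; y_i(k-L_e+1) \,]^T, \tag{4-4} $$

$$ C = [\, C_1^T \;\; C_2^T \;\; \cdots \;\; C_N^T \,]^T, \tag{4-5} $$

and

$$ C_i = [\, c_i(0) \;\; c_i(1) \;\; \cdots \;\; c_i(L_e-1) \,]^T. \tag{4-6} $$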
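Define

$$ m_4 := E\{|e(k)|^4\}, \quad m_2 := E\{|e(k)|^2\}, \quad \tilde{m}_2 := E\{e^2(k)\}. \tag{4-7} $$

Then showing explicit dependence upon $C$, (3-3) may be rewritten as

$$ J(C) = \mathrm{sgn}(\gamma_4) \left[ \frac{m_4 - |\tilde{m}_2|^2}{m_2^2} - 2 \right] \tag{4-8} $$

where

$$ \gamma_4 = m_4 - 2 m_2^2 - |\tilde{m}_2|^2. \tag{4-9} $$

Let $\nabla_C$ denote a gradient operator (w.r.t. a vector $C$). We will follow [7] in formally defining
the complex derivatives. Then we have

$$ \nabla_{C^*} e(k) = 0 \quad \text{and} \quad \nabla_{C^*} e^*(k) = Y^*(k). \tag{4-10} $$

Using the above results in (4-7) we have

$$ \nabla_{C^*} m_4 = 2E\{e^2(k)\, e^*(k)\, Y^*(k)\}, \quad \nabla_{C^*} m_2 = E\{e(k)\, Y^*(k)\} \tag{4-11} $$

and

$$ \nabla_{C^*} \tilde{m}_2 = 0, \quad \nabla_{C^*} \tilde{m}_2^* = 2E\{e^*(k)\, Y^*(k)\}. \tag{4-12} $$

Using (4-8)-(4-12) and after some simplification, we have

$$ \nabla_{C^*} J(C) = \mathrm{sgn}(\gamma_4)\, \frac{2}{m_2^3} \Big[ m_2 E\{|e(k)|^2 e(k)\, Y^*(k)\} - m_2 \tilde{m}_2 E\{e^*(k)\, Y^*(k)\} - [m_4 - |\tilde{m}_2|^2]\, E\{e(k)\, Y^*(k)\} \Big]. \tag{4-13} $$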
We will use a stochastic gradient method for recursification of maximization of $J(C)$ using an
‘instantaneous’ gradient as an estimate of (4-13). Given the estimate $C(k-1)$ of the tap-gains at
time $k-1$, the stochastic gradient method computes the update $C(k)$ at time $k$ as

$$ \tilde{C}(k) = C(k-1) + \mu_1 \nabla_{C^*} J_k(C(k-1)) \tag{4-14} $$

$$ C(k) = \frac{\tilde{C}(k)}{\|\tilde{C}(k)\|} \tag{4-15} $$

where $\mu_1$ is the update step-size and $\nabla_{C^*} J_k(C(k-1))$ is the instantaneous gradient of the cost $J$
(w.r.t. $C^*$) at time $k$ evaluated at $C(k-1)$. Since the cost $J$ is invariant to any scaling of $C$, we
normalize $C$ in (4-15) to have a unit norm. From (4-13) we have the approximation

$$ \nabla_{C^*} J_k(C(k)) = \mathrm{sgn}(\gamma_{4k})\, \frac{2}{m_{2k}^3} \Big\{ \big[ m_{2k} \big( |e(k)|^2 e(k) - \tilde{m}_{2k}\, e^*(k) \big) - \big( m_{4k} - |\tilde{m}_{2k}|^2 \big) e(k) \big]\, Y^*(k) \Big\} \tag{4-16} $$

where

$$ m_{2k} = (1-\mu_2)\, m_{2(k-1)} + \mu_2 |e(k)|^2, \tag{4-17} $$

$$ \tilde{m}_{2k} = (1-\mu_2)\, \tilde{m}_{2(k-1)} + \mu_2\, e^2(k), \tag{4-18} $$

$$ m_{4k} = (1-\mu_2)\, m_{4(k-1)} + \mu_2 |e(k)|^4, \tag{4-19} $$

$$ \gamma_{4k} = m_{4k} - 2 m_{2k}^2 - |\tilde{m}_{2k}|^2 \tag{4-20} $$

and

$$ e(k) = C^T(k)\, Y(k). \tag{4-21} $$

In (4-17)-(4-19) the various quantities represent estimates based upon sample averaging, the
(exponential window) memory being controlled by the forgetting factor $\mu_2$ ($0 < \mu_2 < 1$). The
initializations for (4-17)-(4-19) are: $m_{20} = m_{40} = \tilde{m}_{20} = 0$.
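
One time step of the recursion (4-14)-(4-21) might be sketched as below. This is a hedged illustration, not the paper's implementation: the class name `CumulantSG` is hypothetical, the equalizer output at time k is computed from the previous tap vector C(k − 1) (an assumption made here to obtain a causal update), and the update is skipped while the running power estimate is exactly zero.

```python
import math

class CumulantSG:
    """Sketch of the first-stage stochastic-gradient update (hypothetical class)."""

    def __init__(self, c0, mu1, mu2):
        self.c = self._unit(list(c0))              # unit-norm start, cf. (4-15)
        self.mu1, self.mu2 = mu1, mu2
        self.m2, self.m2t, self.m4 = 0.0, 0j, 0.0  # zero initial moments

    @staticmethod
    def _unit(c):
        norm = math.sqrt(sum(abs(v) ** 2 for v in c))
        return [v / norm for v in c] if norm > 0 else c

    def update(self, Y):
        # Equalizer output using the previous taps C(k-1) (assumption of this sketch).
        e = complex(sum(c * y for c, y in zip(self.c, Y)))
        a = self.mu2
        self.m2 = (1 - a) * self.m2 + a * abs(e) ** 2      # (4-17)
        self.m2t = (1 - a) * self.m2t + a * e * e          # (4-18)
        self.m4 = (1 - a) * self.m4 + a * abs(e) ** 4      # (4-19)
        g4 = self.m4 - 2 * self.m2 ** 2 - abs(self.m2t) ** 2   # (4-20)
        if self.m2 > 0.0:
            # Scalar factor multiplying Y*(k) in the instantaneous gradient (4-16).
            s = (self.m2 * (abs(e) ** 2 * e - self.m2t * e.conjugate())
                 - (self.m4 - abs(self.m2t) ** 2) * e)
            s *= math.copysign(1.0, g4) * 2.0 / self.m2 ** 3
            # Gradient-ascent step (4-14) followed by normalization (4-15).
            self.c = self._unit([c + self.mu1 * s * complex(y).conjugate()
                                 for c, y in zip(self.c, Y)])
        return e
```

A caller would feed the stacked regressor Y(k) of (4-3) at each time step and read off e(k). The step size and forgetting factor must be tuned by the user; no particular values are implied here.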


4.2  First  Stage  Signal  Cancellation 

Now we discuss implementation of (3-6) via sample averaging using an exponential window
controlled by a forgetting factor $\mu_3$. Define ($L_1, L_2 > 0$)

$$ E(k) = [\, e(k+L_1) \;\; e(k+L_1-1) \;\; \cdots \;\; e(k-L_2) \,]^T \tag{4-22} $$

and

$$ F_i = [\, \bar{f}_i(-L_1) \;\; \bar{f}_i(-L_1+1) \;\; \cdots \;\; \bar{f}_i(L_2) \,]^T. \tag{4-23} $$

By Sec. 3, when (4-8) is maximized, $e(k)$ satisfies (3-4) so that for suitable choice of $L_1$ and $L_2$,
there exists a $j_0 \in \{1, 2, \ldots, M\}$ such that

$$ \sum_{l} f_{ij_0}(l)\, w_{j_0}(k-l) = F_i^T E(k), \quad i = 1, 2, \ldots, N. \tag{4-24} $$

In order to implement (3-7) and (3-8), we need recursive estimates of $F_i$. The estimate $F_i(k)$ of $F_i$
at time $k$ is provided by

$$ F_i(k) = R_i(k) / m_{ee}(k) \tag{4-25} $$

where

$$ m_{ee}(k) = (1-\mu_3)\, m_{ee}(k-1) + \mu_3 |e(k)|^2, \tag{4-26} $$

$$ R_i(k) = (1-\mu_3)\, R_i(k-1) + \mu_3\, y_i(k)\, E^*(k). \tag{4-27} $$
4.3  Multistage  Algorithm 


In Secs. 4.1 and 4.2 we discussed the first stage of the algorithm where we have $N$ sensors and $M$
sources. Now we put it all together following the source-iterative solution of Sec. 3 and discuss
extraction of $M$ sources including the cancellation of the extracted sources. We will use the
superscript $(m)$ to denote the various quantities pertaining to stage $m$. These have been used previously
in Secs. 4.1 and 4.2 without this superscript; for instance, $C^{(m)}(k)$ now denotes the estimate of the
tap-gain vector at time $k$ at stage $m$, etc.


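Initialization:

$$ Y^{(1)}(k) = Y(k) \ \text{as in (4-3)} \tag{4-28} $$

DO FOR $m = 1, 2, \ldots, M$

$$ \tilde{C}^{(m)}(k) = C^{(m)}(k-1) + \mu_1 \nabla_{C^*} J_k^{(m)}(C^{(m)}(k-1)) \tag{4-29} $$

$$ C^{(m)}(k) = \frac{\tilde{C}^{(m)}(k)}{\|\tilde{C}^{(m)}(k)\|} \tag{4-30} $$

where

$$ \nabla_{C^*} J_k^{(m)}(C^{(m)}(k)) = \mathrm{sgn}(\gamma_{4k}^{(m)})\, \frac{2}{(m_{2k}^{(m)})^3} \Big\{ \big[ m_{2k}^{(m)} \big( |e^{(m)}(k)|^2 e^{(m)}(k) - \tilde{m}_{2k}^{(m)}\, e^{(m)*}(k) \big) - \big( m_{4k}^{(m)} - |\tilde{m}_{2k}^{(m)}|^2 \big) e^{(m)}(k) \big]\, Y^{(m)*}(k) \Big\}, \tag{4-31} $$

$$ m_{2k}^{(m)} = (1-\mu_2)\, m_{2(k-1)}^{(m)} + \mu_2 |e^{(m)}(k)|^2, \tag{4-32} $$

$$ \tilde{m}_{2k}^{(m)} = (1-\mu_2)\, \tilde{m}_{2(k-1)}^{(m)} + \mu_2\, [e^{(m)}(k)]^2, \tag{4-33} $$

$$ m_{4k}^{(m)} = (1-\mu_2)\, m_{4(k-1)}^{(m)} + \mu_2 |e^{(m)}(k)|^4, \tag{4-34} $$

$$ \gamma_{4k}^{(m)} = m_{4k}^{(m)} - 2\, (m_{2k}^{(m)})^2 - |\tilde{m}_{2k}^{(m)}|^2 \tag{4-35} $$

and

$$ e^{(m)}(k) = C^{(m)T}(k)\, Y^{(m)}(k). \tag{4-36} $$

Set

$$ \hat{y}_i^{(m)}(k) := F_i^{(m)T}(k)\, E^{(m)}(k) \tag{4-37} $$

where $\hat{y}_i^{(m)}(k)$ represents (cf. (3-7)) the contribution of the extracted source at the $m$-th
stage to the measurement at time $k$ at the $i$-th sensor, and where

$$ F_i^{(m)}(k) = R_i^{(m)}(k) / m_{ee}^{(m)}(k), \tag{4-38} $$

$$ m_{ee}^{(m)}(k) = (1-\mu_3)\, m_{ee}^{(m)}(k-1) + \mu_3 |e^{(m)}(k)|^2, \tag{4-39} $$

$$ E^{(m)}(k) = [\, e^{(m)}(k+L_1) \;\; e^{(m)}(k+L_1-1) \;\; \cdots \;\; e^{(m)}(k-L_2) \,]^T \tag{4-40} $$

and

$$ R_i^{(m)}(k) = (1-\mu_3)\, R_i^{(m)}(k-1) + \mu_3\, y_i^{(m)}(k)\, E^{(m)*}(k). \tag{4-41} $$

Define

$$ Y_i^{(m)}(k) = [\, y_i^{(m)}(k) \;\; y_i^{(m)}(k-1) \;\; \cdots \;\; y_i^{(m)}(k-L_e+1) \,]^T. \tag{4-42} $$

Set

$$ Y^{(m+1)}(k) = [\, Y_1^{(m+1)T}(k) \;\; \cdots \;\; Y_N^{(m+1)T}(k) \,]^T \tag{4-43} $$

where

$$ y_i^{(m+1)}(k) := y_i^{(m)}(k) - \hat{y}_i^{(m)}(k). \tag{4-44} $$

ENDDO

The sequence $\{e^{(m)}(k)\}$ in (4-36) represents the equalized (up to a scale factor and time delay)
source at stage $m$.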


Remark 3. If $M$ were unknown, the proposed approach would still work in the following sense. If $M$ were underestimated, some sources would be missed, but each extracted source would still correspond to one of the users (or interferers). If $M$ were overestimated, all the users/interferers would be recovered, in addition to some "meaningless junk" outputs in stages $M_0 + 1$ and later, where $M_0$ denotes the true number of users. Indeed, one can test the residuals (4-44) (see also (3-8)) to check whether any significant non-Gaussian components remain in the data before implementing another equalizer in parallel. We do not pursue this aspect in this paper. □


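Running Cost. To monitor the convergence of the equalizers in various stages of the algorithm, it is useful to calculate a running cost (4-8) without the sign. Let $\bar{J}_k^{(m)}$ denote the running cost for the $m$-th stage at time $k$, given by

$$\bar{J}_k^{(m)} = \frac{\bar{m}_{4k}^{(m)} - |\bar{\widetilde{m}}_{2k}^{(m)}|^2}{\big(\bar{m}_{2k}^{(m)}\big)^2} - 2 \tag{4-45}$$

where

$$\bar{m}_{2k}^{(m)} = (1-\mu_4)\,\bar{m}_{2(k-1)}^{(m)} + \mu_4\,|e^{(m)}(k)|^2, \tag{4-46}$$

$$\bar{\widetilde{m}}_{2k}^{(m)} = (1-\mu_4)\,\bar{\widetilde{m}}_{2(k-1)}^{(m)} + \mu_4\,\big(e^{(m)}(k)\big)^2 \tag{4-47}$$

and

$$\bar{m}_{4k}^{(m)} = (1-\mu_4)\,\bar{m}_{4(k-1)}^{(m)} + \mu_4\,|e^{(m)}(k)|^4. \tag{4-48}$$

For all of the simulations presented in Sec. 5, we took $\mu_4 = 0.002$.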
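The running cost recursion can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `running_cost` is hypothetical, the moment estimates start from zero, and the inputs are ideal noise-free BPSK and 4-QAM sequences rather than actual equalizer outputs.

```python
import random

def running_cost(samples, mu4=0.002):
    """Track (m4 - |m2t|^2) / m2^2 - 2 with exponential forgetting factor mu4."""
    m2 = 0.0    # running estimate of E{|e|^2}
    m2t = 0j    # running estimate of E{e^2}
    m4 = 0.0    # running estimate of E{|e|^4}
    cost = []
    for e in samples:
        p = abs(e) ** 2
        m2 = (1 - mu4) * m2 + mu4 * p
        m2t = (1 - mu4) * m2t + mu4 * e * e
        m4 = (1 - mu4) * m4 + mu4 * p * p
        cost.append((m4 - abs(m2t) ** 2) / m2 ** 2 - 2)
    return cost

rng = random.Random(0)
bpsk = [complex(rng.choice((-1.0, 1.0)), 0.0) for _ in range(5000)]
qam4 = [complex(rng.choice((-1.0, 1.0)), rng.choice((-1.0, 1.0))) for _ in range(5000)]
bpsk_cost = running_cost(bpsk)[-1]
qam_cost = running_cost(qam4)[-1]
```

For these ideal inputs the final values approach -2 (BPSK) and -1 (4-QAM), the normalized fourth cumulants quoted in Sec. 5. Because the moments start from zero, the earliest cost values are strongly biased and should not be read as convergence information.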
5  Simulation  Examples

In  this  section  we  provide  three  computer  simulation  examples  to  illustrate  the  proposed  blind 
adaptive  algorithm  for  multiuser  signal  separation  and  interference  suppression. 

5.1  Example  1:  3  BPSK  Sources  and  7  Sensors 

We consider a wireless communications scenario with three ($M = 3$) BPSK user signals arriving at a uniform linear array of $N = 7$ sensors via a frequency selective multipath channel. The array elements are spaced half a wavelength apart. The array measurements are assumed to be sampled at baud rate (for simulation convenience only) with sampling interval $T$ seconds and the three sources have the same baud rate. The relative time delay $\tau$ (relative to the first arrival), the angle of arrival $\theta$ (in degrees w.r.t. the array broadside) and the relative attenuation factor (amplitude) $\alpha$ for various sources were selected as:

$w_1$: $(\tau,\theta,\alpha)$ = (0T, 10°, 0.5), (1T, 50°, 0.75)

$w_2$: $(\tau,\theta,\alpha)$ = (0T, -20°, 0.5), (1T, 45°, 0.45), (2T, 15°, -0.65)

$w_3$: $(\tau,\theta,\alpha)$ = (0T, -35°, 0.7), (1T, -5°, 0.4).

Thus the signals $w_1$ and $w_3$ propagate through two paths whereas $w_2$ passes through three paths. The signals arriving at the array were normalized such that the signal powers for users 1 and 2 are equal, and 3 dB higher than the signal power for user 3. Additive white (both temporally and spatially) Gaussian noise was added to the array measurements to achieve a signal-to-noise ratio (SNR) of 11.55 dB (ratio = 100/7) for the strongest user(s). The SNR for a given user $w_j(k)$ is defined as

-  £{K(t)P} 

The proposed approach was applied with $M = 3$ equalizers and $M - 1 = 2$ signal cancellers running in parallel, each successive equalizer put in operation after waiting for 200 samples (symbols) w.r.t. the previous stage. The equalizer length was chosen to be 5 taps per sensor ($L_e = 5$ in (4-2)). The initial guess for the tap gains was taken to be center-tap initialization: set $c_i(2) = 1$ for $i = m$ for the $m$-th stage equalizer ($m = 1, 2, 3$) with the remaining tap gains set to zero. The algorithm step sizes and forgetting factors for each stage $m$ were chosen as: $\mu_1 = 0.003$ in (4-29), $\mu_2 = 0.015$ in (4-32)-(4-34) and $\mu_3 = 0.0005$ in (4-39) and (4-41). For the running cost (4-45) computation we selected $\mu_4 = 0.002$ in (4-46)-(4-48). The parameters $L_1$ and $L_2$ in (4-40) (see also (4-22) and (4-23)) were selected as $L_1 = 15$ and $L_2 = 6$.

Fig. 1 shows the evolution of the average running cost (see (4-45)), averaged over 100 Monte Carlo runs after 'assigning' each equalizer cost to its corresponding extracted source. For BPSK sources the 4th-order normalized cumulant equals -2; therefore, at convergence, the running cost (4-45) should be close to -2. In Fig. 1 we see these values to be around -1.89, which is largely a consequence of noise in the data: Gaussian noise affects only the denominator of (4-45), making it larger than it should be. Table 1 shows the signal-to-interference-and-noise ratio (SINR) and the probability of error $P_e$ at the output of each equalizer at selected time instants, averaged over 100 Monte Carlo runs and 3000 symbols. [The equalizer tap gains at the chosen time instants were 'frozen' and used to equalize data of length 3000 symbols in order to calculate SINR and $P_e$. The equalized data were rotated, scaled and shifted before calculating the two performance measures.] It is seen from Fig. 1 and Table 1 that the proposed approach works well. As noted earlier, [12] has shown that CMA/Godard cost functions will have problems with the user signals considered in this example.

5.2  Example  2:  3  4-QAM  Sources  and  7  Sensors 

This example is the same as Example 1 except that the three user signals are 4-QAM. The other parameters for signal generation and equalization are just as for Example 1 (e.g. user signals 1 and 2 are 3 dB stronger than user signal 3, etc.). The counterparts to Fig. 1 and Table 1 are now shown in Fig. 2 and Table 2, respectively. For 4-QAM sources the 4th-order normalized cumulant equals -1; therefore, at convergence, the running cost (4-45) should be close to -1. The convergence is now slower, yet the approach still works well. The weaker user signal now takes longer to be extracted.

5.3  Example  3:  2  Mixed  Sources  and  5  Sensors 

In this example we consider a 4-QAM user signal $w_1$ (4th normalized cumulant as -1) and a non-communications signal $w_2$ consisting of an i.i.d. complex Gaussian-mixture (independent and identically distributed real and imaginary parts with the real part being $\mathcal{N}(0,1)$ with probability 0.9 and $\mathcal{N}(0,4)$ with probability 0.1) with 4th normalized cumulant as 0.7433. The multipath channels for the two signals were selected as:

$w_1$: $(\tau,\theta,\alpha)$ = (0T, 10°, 0.5), (1T, 50°, 0.75)

$w_2$: $(\tau,\theta,\alpha)$ = (0T, -20°, 0.5), (1T, 45°, 0.45), (2T, 15°, -0.65).

The two signals have equal power. Additive white Gaussian noise was added to the array measurements to achieve an SNR of 13 dB (ratio = 20) for each user signal.
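As a quick arithmetic check on the SNR figures quoted in the examples, the power ratios can be converted to decibels with 10 log10(ratio). The helper name `ratio_to_db` below is introduced only for this illustration.

```python
import math

def ratio_to_db(power_ratio):
    """Convert a power ratio to decibels."""
    return 10.0 * math.log10(power_ratio)

snr_example1_db = ratio_to_db(100.0 / 7.0)  # ratio quoted for Example 1
snr_example3_db = ratio_to_db(20.0)         # ratio quoted for Example 3
```

The two results round to 11.55 dB and 13.01 dB, which agrees with the values stated in the text.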

The proposed approach was applied with $M = 2$ equalizers and $M - 1 = 1$ signal canceller running in parallel, the second equalizer put in operation after waiting for 200 samples. The equalizer length was chosen to be 5 taps per sensor ($L_e = 5$ in (4-2)). The initial guess for the tap gains was taken to be center-tap initialization: set $c_i(2) = 1$ for $i = m$ for the $m$-th stage equalizer ($m = 1, 2$) with the remaining tap gains set to zero. The algorithm step sizes and forgetting factors were chosen as: $\mu_1 = 0.0005$ in (4-29), $\mu_2 = 0.015$ in (4-32)-(4-34) and $\mu_3 = 0.0005$ in (4-39) and (4-41) when $\gamma_{4k}^{(m)} < 0$ (see (4-35)), and $\mu_1 = 0.0001$ in (4-29), $\mu_2 = 0.003$ in (4-32)-(4-34) and $\mu_3 = 0.0005$ in (4-39) and (4-41) when $\gamma_{4k}^{(m)} > 0$. The parameters $L_1$ and $L_2$ in (4-40) were selected as $L_1 = 15$ and $L_2 = 6$. For the running cost (4-45) computation we selected $\mu_4 = 0.002$ in (4-46)-(4-48).

The counterparts to Fig. 1 and Table 1 are now shown in Fig. 3 and Table 3, respectively, where in Table 3 the $P_e$ for signal 2 is omitted (for obvious reasons). The convergence for the source with positive 4th cumulant is quite slow.

6  Conclusions 

The problem of separating multiple signals (including possibly non-digital communications interferences) received at an antenna array in a wireless communications system was considered in the absence of any training sequences. The signals are allowed to undergo multipath propagation where the delay spreads are not necessarily negligible. In [4],[5] an iterative, inverse filter criteria based approach has been developed for deconvolution of multichannel non-Gaussian processes using the fourth-order normalized cumulants of the inverse filtered data at zero-lag. The approach is input-iterative, i.e., the inputs are extracted and removed one-by-one. The matrix impulse response is then obtained by cross-correlating the extracted inputs with the observed outputs. In this paper we developed a stochastic gradient-based recursification of all of the batch optimization steps in [4],[5]. The proposed blind adaptive algorithm was illustrated via three simulation examples involving frequency selective multipath channels.

It has been pointed out in [12] that for complex MIMO channel-equalizer cascades, but with real-valued sources, the CMA/Godard costs will have some undesirable global minima in that the real and imaginary parts of each equalizer output after convergence may correspond to different user signals. It has been shown in [12] that the reason for this is that such real-valued signals are asymmetric (i.e. $E\{w_j^2(k)\} \neq 0$). Such a misconvergence cannot occur for the cost function considered in this paper.
Table 2: Example 2: Performance measures at selected times: averages over 100 Monte Carlo runs

# of samples   User 1                    User 2                    User 3
               SINR(dB)   Pe             SINR(dB)   Pe             SINR(dB)   Pe
4000           12.36      0.0942         12.78      0.0483         13.75      0.0011
6000           14.36      0.0292         14.54      0.0073         14.70      < 3 x 10^-6
8000           15.06      < 3 x 10^-6    14.80      < 3 x 10^-6    14.93      < 3 x 10^-6
12000          15.17      < 3 x 10^-6    14.98      < 3 x 10^-6    14.98      < 3 x 10^-6
Table 3: Example 3: Performance measures at selected times: averages over 100 Monte Carlo runs

# of samples   User 1                    User 2
               SINR(dB)   Pe             SINR(dB)
8000           14.12      0.0072
12000          14.79      < 3 x 10^-6
16000          15.06      < 3 x 10^-6


[Figure 1: Average running cost for Example 1, averaged over 100 Monte Carlo runs. Plot title: 7 sensors, 3 BPSK sources (averaged over 100 runs); vertical axis: running cost.]

[Figure 2: Average running cost for Example 2, averaged over 100 Monte Carlo runs. Plot title: 7 sensors, 3 4-QAM sources (averaged over 100 runs); vertical axis: running cost.]

[Figure 3: Average running cost for Example 3, averaged over 100 Monte Carlo runs. Plot title: 5 sensors, 2 'mixed' sources (averaged over 100 runs).]
ABSTRACT

This paper is concerned with the problem of adaptive deconvolution and estimation of the matrix impulse response function of a multiple-input multiple-output system given only the measurements of the vector output of the system. The system is assumed to be driven by a spatially and temporally i.i.d. non-Gaussian vector sequence (which is not observed). Recently an iterative, inverse filter criteria based approach was developed using the third-order and/or fourth-order normalized cumulants of the inverse filtered data at zero-lag. The approach was input-iterative, i.e., the inputs were extracted and removed one-by-one. The matrix impulse response was then obtained by cross-correlating the extracted inputs with the observed outputs. In this paper an adaptive implementation of the above approach is developed using a stochastic gradient approach. Simulation examples are presented to illustrate the proposed approach.

1.  INTRODUCTION 

Consider a discrete-time MIMO system, possibly complex-valued, with $N$ outputs and $M$ inputs. The $i$-th component of the output at time $k$ is given by

$$y_i(k) = \sum_{j=1}^{M} \mathcal{F}_{ij}(z)\,w_j(k) + n_i(k), \quad i = 1, 2, \ldots, N \tag{1-1}$$

$$\Rightarrow\ \mathbf{y}(k) = \mathcal{F}(z)\,\mathbf{w}(k) + \mathbf{n}(k), \tag{1-2}$$

where $\mathbf{y}(k) = [\,y_1(k)\ \ y_2(k)\ \ \cdots\ \ y_N(k)\,]^T$, similarly for $\mathbf{w}(k)$ and $\mathbf{n}(k)$, $z^{-1}$ denotes both the backward-shift operator (i.e., $z^{-1}\mathbf{w}(k) = \mathbf{w}(k-1)$, etc.) as well as the complex variable $z$ in the $z$-transform, $w_j(k)$ is the $j$-th input at sampling time $k$, $y_i(k)$ is the $i$-th output, $n_i(k)$ is the additive Gaussian measurement noise independent of $\{\mathbf{w}(k)\}$, and $\mathcal{F}_{ij}(z) := \sum_l f_{ij}(l)\,z^{-l}$ is the scalar transfer function with $w_j(k)$ as the input and $y_i(k)$ as the output. The MIMO transfer function is $\mathcal{F}(z)$ with $ij$-th element $\mathcal{F}_{ij}(z)$. The model (1-1)-(1-2) is the space-time baseband-equivalent channel model used by several authors (e.g. [3]-[7], [9]-[10] and references therein). The above model could be the result of baud-rate sampling of continuous-time signals at $N$ sensors, or it could be the result of oversampling (fractional sampling) at fewer than $N$ sensors [1]-[3].

In [4],[5] an iterative, inverse filter criteria based approach has been developed for deconvolution of multichannel non-Gaussian processes using the fourth-order normalized cumulants of the inverse filtered data at zero-lag. The approach is input-iterative, i.e., the inputs are extracted and removed one-by-one. The matrix impulse response is then obtained by cross-correlating the extracted inputs with the observed outputs. In this paper we develop a stochastic gradient-based "recursification" of all of the batch optimization steps in [4],[5]. An interesting input-iterative adaptive approach using prewhitened observations and the fourth-order cumulant of the inverse-filtered data at zero-lag has been considered in [11] and [12]. The inverse filter is constrained to have a lossless filter structure which is realized using a lossless lattice filter. Such a restriction can lead to ill-conditioning of the algorithm of [11] as one iteratively extracts input sequences. A fix to this is proposed in [12] but it works only for the two-input case. Refs. [11] and [12] are restricted to $M = N$ whereas in this paper we allow $N \geq M$, a common occurrence in array processing. Moreover, in this paper we perform no prewhitening; rather, we operate directly on the given measurements.

2.  MODEL  ASSUMPTIONS 

The  following  assumptions  are  made  concerning  the  system 
model  (1-1)  and  (1-2): 

(AS1) The vector sequence $\{\mathbf{w}(k)\}$ is zero-mean, temporally i.i.d. (independent and identically distributed) and spatially independent, i.e., various components of $\mathbf{w}(k)$ are independent of each other but not necessarily identically distributed. Assume that the fourth-order cumulants (see (3-1) later) of all the components of $\mathbf{w}(k)$ are nonzero but not necessarily negative.

(AS2) If it is an infinite impulse response (IIR) model, then (1-2) is assumed to be the result of a finite-dimensional multichannel ARMA model such that the model matrix impulse response function is exponentially stable, i.e., $\|[f_{ij}(l)]\| < a\beta^{|l|}$ for some $0 < a < \infty$ and $0 < \beta < 1$ where $[f_{ij}(l)]$ denotes a matrix with its $ij$-th element as $f_{ij}(l)$.

(AS3) $N \geq M$, i.e. at least as many outputs as inputs.

(AS4) Rank$\{\mathcal{F}(z)\} = M$ for any $|z| = 1$.

Notice that we allow the fourth-order cumulants of some components of $\mathbf{w}(k)$ to be positive. Moreover, we do not require $E\{w_j^2(k)\} = 0$ if the component $w_j(k)$ has negative fourth cumulant; this is in contrast to the CMA/Godard algorithm-based approaches where we also must have $E\{w_j^2(k)\} = 0$ in addition to negative fourth cumulant of $w_j(k)$. The objective is to recover $w_j(k)$ $\forall j$. It has been pointed out in [9] that for complex MIMO channel-equalizer cascades, but with real-valued sources, the CMA/Godard costs will have some undesirable global minima. "The real and imaginary parts of each equalizer output after convergence, may correspond to different user signals" [9]. It has been shown in [9] that the reason for this is that such real-valued signals are asymmetric (i.e. $E\{w_j^2(k)\} \neq 0$). Such a misconvergence cannot occur for the cost function (3-3) considered in this paper [5].

3. AN ITERATIVE SOLUTION

In this section we briefly discuss the batch (non-recursive) approach of [4],[5]; its adaptive version is developed in Sec. 4. Let $\mathrm{CUM}_4(w)$ denote the fourth-order cumulant of a complex-valued scalar zero-mean random variable $w$, defined as

$$\mathrm{CUM}_4(w) := E\{|w|^4\} - 2\big[E\{|w|^2\}\big]^2 - \big|E\{w^2\}\big|^2. \tag{3-1}$$
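A sample-average version of (3-1), normalized by the squared second moment, can be sketched as below. The function name `normalized_cum4` and the test sequences are assumptions made for illustration; expectations are simply replaced by averages over the given samples.

```python
import random

def normalized_cum4(w):
    """Sample estimate of CUM4(w) / (E{|w|^2})^2 for a zero-mean complex sequence."""
    n = len(w)
    m4 = sum(abs(x) ** 4 for x in w) / n     # estimate of E{|w|^4}
    m2 = sum(abs(x) ** 2 for x in w) / n     # estimate of E{|w|^2}
    m2t = sum(x * x for x in w) / n          # estimate of E{w^2}
    return (m4 - 2 * m2 ** 2 - abs(m2t) ** 2) / m2 ** 2

bpsk = [1 + 0j, -1 + 0j] * 50
qam4 = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j] * 25
rng = random.Random(0)
gauss = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(20000)]

k_bpsk = normalized_cum4(bpsk)
k_qam4 = normalized_cum4(qam4)
k_gauss = normalized_cum4(gauss)
```

For the balanced BPSK and 4-QAM sequences the estimate equals -2 and -1 up to rounding, matching the values quoted in the simulation section, while for circular complex Gaussian samples it is close to zero, which is why Gaussian components cannot be extracted by this criterion.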


Consider a $1 \times N$ row-vector polynomial equalizer (filter) $\mathbf{C}^T(z)$, with its $i$-th entry denoted by $C_i(z)$, operating on the data vector $\mathbf{y}(k)$. Let the equalizer output be denoted by $e(k)$:

$$e(k) = \sum_{i=1}^{N} C_i(z)\,y_i(k). \tag{3-2}$$

Following [4] consider maximization of the cost

$$J := \frac{|\mathrm{CUM}_4(e(k))|}{\big[E\{|e(k)|^2\}\big]^2} \tag{3-3}$$

for designing a linear equalizer to recover one of the inputs. It is shown [4] that when (3-3) is maximized w.r.t. $\mathbf{C}(z)$, then (3-2) reduces to

$$e(k) = d\,w_{j_0}(k - k_0), \tag{3-4}$$

where $d$ is some complex constant, $k_0$ is some integer, $j_0$ indexes some input out of the given $M$ inputs, i.e., the equalizer output is a possibly scaled and shifted version of one of the system inputs. It has been established in [5] that under (AS1)-(AS4) and no noise, such a solution exists and if doubly-infinite equalizers are used, then all locally stable stationary points of the given cost w.r.t. the equalizer coefficients are also characterized by solutions such as (3-4). A source-iterative solution is given by:

Step 1. Maximize (3-3) w.r.t. the equalizer $\mathbf{C}(z)$ to obtain (3-4).

Step 2. Cross-correlate $\{e(k)\}$ (of (3-4)) with the given data (1-2) and define a possibly scaled and shifted estimate of $f_{ij_0}(\tau)$ as

$$\hat{f}_{ij_0}(\tau) := \frac{E\{y_i(k)\,e^*(k-\tau)\}}{E\{|e(k)|^2\}} \tag{3-5}$$

where $\mathcal{F}_{ij}(z) = \sum_l f_{ij}(l)\,z^{-l}$. Consider now the reconstructed contribution of $e(k)$ to the data $y_i(k)$ ($i = 1, 2, \ldots, N$), denoted by $\hat{y}_{i,j_0}(k)$:

$$\hat{y}_{i,j_0}(k) := \sum_l \hat{f}_{ij_0}(l)\,e(k-l). \tag{3-6}$$

Step 3. Remove the above contribution from the data to define the outputs of a MIMO system with $N$ outputs and $M - 1$ inputs. These are given by

$$y_i'(k) := y_i(k) - \hat{y}_{i,j_0}(k). \tag{3-7}$$

Step 4. If $M > 1$, set $M \leftarrow M - 1$, $y_i(k) \leftarrow y_i'(k)$, and go back to Step 1, else quit.

In practice, all the expectations in (3-5) are replaced with their sample averages over appropriate data records.

It has been shown in [4],[5] that

$$\hat{y}_{i,j_0}(k) = \sum_l f_{ij_0}(l)\,w_{j_0}(k-l), \tag{3-8}$$

i.e., we have decomposed the observations at the various sensors into their independent components: $\hat{y}_{i,j_0}(k)$ represents the contribution of $\{w_{j_0}(k)\}$ to the $i$-th sensor, achieving blind signal separation.

Theorem 1 [5]: Given the model (1-2) such that $\mathbf{n}(k) = 0$ and given the true 4th-order and 2nd-order cumulant functions of the model output $\{\mathbf{y}(k)\}$ such that conditions (AS1)-(AS4) hold true. Suppose that doubly infinite equalizers are used in steps 1-4 of the iterative procedure of Sec. 3. Then this procedure yields a transfer function $\mathcal{A}(z)$ satisfying

$$\mathcal{A}(z) = \mathcal{F}(z)\,\mathbf{D}\,\mathbf{\Lambda}\,\mathbf{P}. \tag{3-9}$$

The results of [4],[5] are based upon the use of doubly-infinite inverse filters. If we assume that $\mathcal{F}(z)$ has finite impulse response (FIR) and rank$\{\mathcal{F}(z)\} = M$ for any $z$ (including $z = \infty$ but excluding $z = 0$), then finite length inverse filters suffice. For an analysis and further elaborations, see [6] and [7] where a Godard cost function is considered but the results of [6] and [7] can be easily modified to apply to the cost (3-3). The following result follows from [5] and [7].

Theorem 2: Given the FIR model (1-2) such that $\mathbf{n}(k) = 0$ and conditions (AS1) and (AS4) hold true. Suppose that steps 1-4 of the iterative procedure of Sec. 3 are used and the record length tends to infinity. Then this procedure yields a transfer function $\mathcal{A}(z)$ satisfying (3-9) if one of the following holds true:

(A) Rank$\{\mathcal{F}(z)\} = M$ for any $z$ (including $z = \infty$ but excluding $z = 0$), and doubly-infinite equalizers are used.

(B) Rank$\{\mathcal{F}(z)\} = M$ for any $z$ (including $z = \infty$ but excluding $z = 0$), $\mathcal{F}(z)$ is column-reduced and FIR equalizers with length $L_e \geq (2M - 1)L_c - 1$ are used where $L_c$ = channel length. •

4. ADAPTIVE ALGORITHM

In this section we develop a stochastic gradient-based "recursification" of all of the batch optimization steps discussed in Sec. 3. Theorems 1 and 2 of Sec. 3 motivate and justify the algorithm developed in this section.

4.1. First Stage Maximization of Normalized Fourth Cumulant

Let the length of the equalizer $\mathbf{C}(z)$ be $L_e$ and let

$$C_i(z) = \sum_{l=0}^{L_e-1} c_i(l)\,z^{-l}. \tag{4-1}$$

This allows us to rewrite (3-2) as

$$e(k) = \sum_{i=1}^{N}\sum_{l=0}^{L_e-1} c_i(l)\,y_i(k-l) = \mathbf{C}^T\mathbf{Y}(k) \tag{4-2}$$

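where

$$\mathbf{Y}(k) = \big[\,\mathbf{Y}_1^T(k)\ \ \mathbf{Y}_2^T(k)\ \ \cdots\ \ \mathbf{Y}_N^T(k)\,\big]^T, \tag{4-3}$$

$$\mathbf{Y}_i(k) = \big[\,y_i(k)\ \ y_i(k-1)\ \ \cdots\ \ y_i(k-L_e+1)\,\big]^T, \tag{4-4}$$

$$\mathbf{C} = \big[\,\mathbf{C}_1^T\ \ \mathbf{C}_2^T\ \ \cdots\ \ \mathbf{C}_N^T\,\big]^T \tag{4-5}$$

and

$$\mathbf{C}_i = \big[\,c_i(0)\ \ c_i(1)\ \ \cdots\ \ c_i(L_e-1)\,\big]^T. \tag{4-6}$$

Define

$$m_4 = E\{|e(k)|^4\}, \quad m_2 = E\{|e(k)|^2\}, \quad \widetilde{m}_2 = E\{e^2(k)\}. \tag{4-7}$$

Then showing explicit dependence upon $\mathbf{C}$, (3-3) may be rewritten as

$$J(\mathbf{C}) = \mathrm{sgn}(\gamma_4)\,\frac{\gamma_4}{m_2^2} \tag{4-8}$$

where

$$\gamma_4 = m_4 - 2m_2^2 - |\widetilde{m}_2|^2. \tag{4-9}$$

Let $\nabla_{\mathbf{C}^*}$ denote a gradient operator (w.r.t. a vector $\mathbf{C}^*$). We will follow [7] in formally defining the complex derivatives. Then we have

$$\nabla_{\mathbf{C}^*} e(k) = 0 \quad \text{and} \quad \nabla_{\mathbf{C}^*} e^*(k) = \mathbf{Y}^*(k). \tag{4-10}$$

Using the above results in (4-7) we have

$$\nabla_{\mathbf{C}^*} m_4 = 2E\{e^2(k)\,e^*(k)\,\mathbf{Y}^*(k)\}, \quad \nabla_{\mathbf{C}^*} m_2 = E\{e(k)\,\mathbf{Y}^*(k)\} \tag{4-11}$$

and

$$\nabla_{\mathbf{C}^*} \widetilde{m}_2 = 0, \quad \nabla_{\mathbf{C}^*} \widetilde{m}_2^* = 2E\{e^*(k)\,\mathbf{Y}^*(k)\}. \tag{4-12}$$

Using (4-8)-(4-12) and after some simplification, we have

$$\nabla_{\mathbf{C}^*} J(\mathbf{C}) = \mathrm{sgn}(\gamma_4)\,\frac{2}{m_2^3}\Big\{m_2\,E\{|e(k)|^2 e(k)\,\mathbf{Y}^*(k)\} - m_2\widetilde{m}_2\,E\{e^*(k)\,\mathbf{Y}^*(k)\} - \big[m_4 - |\widetilde{m}_2|^2\big]E\{e(k)\,\mathbf{Y}^*(k)\}\Big\}. \tag{4-13}$$

We will use a stochastic gradient method for recursification of maximization of $J(\mathbf{C})$ using an 'instantaneous' gradient as an estimate of (4-13). Given the estimate $\mathbf{C}(k-1)$ of the tap-gains at time $k-1$, the stochastic gradient method computes the update $\mathbf{C}(k)$ at time $k$ as

$$\widetilde{\mathbf{C}}(k) = \mathbf{C}(k-1) + \mu_1 \nabla_{\mathbf{C}^*} J_k\big(\mathbf{C}(k-1)\big) \tag{4-14}$$

$$\mathbf{C}(k) = \frac{\widetilde{\mathbf{C}}(k)}{\|\widetilde{\mathbf{C}}(k)\|} \tag{4-15}$$

where $\mu_1$ is the update step-size and $\nabla_{\mathbf{C}^*} J_k(\mathbf{C}(k-1))$ is an instantaneous gradient of the cost $J$ (w.r.t. $\mathbf{C}^*$) at time $k$ evaluated at $\mathbf{C}(k-1)$. Since the cost $J$ is invariant to any scaling of $\mathbf{C}$, we normalize $\mathbf{C}$ in (4-15) to have a unit norm. From (4-13) we have the approximation

$$\nabla_{\mathbf{C}^*} J_k\big(\mathbf{C}(k-1)\big) = \mathrm{sgn}(\gamma_{4k})\,\frac{2}{m_{2k}^3}\Big\{\Big[m_{2k}\big(|e(k)|^2 e(k) - \widetilde{m}_{2k}\,e^*(k)\big) - \big(m_{4k} - |\widetilde{m}_{2k}|^2\big)e(k)\Big]\mathbf{Y}^*(k)\Big\} \tag{4-16}$$

where

$$m_{2k} = (1-\mu_2)\,m_{2(k-1)} + \mu_2\,|e(k)|^2, \tag{4-17}$$

$$\widetilde{m}_{2k} = (1-\mu_2)\,\widetilde{m}_{2(k-1)} + \mu_2\,e^2(k), \tag{4-18}$$

$$m_{4k} = (1-\mu_2)\,m_{4(k-1)} + \mu_2\,|e(k)|^4, \tag{4-19}$$

$$\gamma_{4k} = m_{4k} - 2m_{2k}^2 - |\widetilde{m}_{2k}|^2 \tag{4-20}$$

and

$$e(k) = \mathbf{C}^T(k)\,\mathbf{Y}(k). \tag{4-21}$$

In (4-17)-(4-19) the various quantities represent estimates based upon sample averaging, the (exponential window) memory being controlled by the forgetting factor $\mu_2$ ($0 < \mu_2 < 1$). The initializations for (4-17)-(4-19) are: $m_{20} = m_{40} = \widetilde{m}_{20} = 0$.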

4.2. First Stage Signal Cancellation

Now we discuss implementation of (3-6) via sample averaging using an exponential window controlled by a forgetting factor $\mu_3$. Define ($L_1, L_2 > 0$)

$$\mathbf{E}(k) = \big[\,e(k+L_1)\ \ e(k+L_1-1)\ \ \cdots\ \ e(k-L_2)\,\big]^T \tag{4-22}$$

and

$$\mathbf{F}_i = \big[\,f_i(-L_1)\ \ f_i(-L_1+1)\ \ \cdots\ \ f_i(L_2)\,\big]^T. \tag{4-23}$$

By Sec. 3, when (4-8) is maximized, $e(k)$ satisfies (3-4) so that for suitable choice of $L_1$ and $L_2$, there exists a $j_0 \in \{1, 2, \ldots, M\}$ such that

$$\sum_l f_{ij_0}(l)\,w_{j_0}(k-l) = \mathbf{F}_i^T\,\mathbf{E}(k), \quad i = 1, 2, \ldots, N. \tag{4-24}$$

In order to implement (3-6) and (3-7), we need recursive estimates of $\mathbf{F}_i$. The estimate $\mathbf{F}_i(k)$ of $\mathbf{F}_i$ at time $k$ is provided by

$$\mathbf{F}_i(k) = \mathbf{R}_i(k)/m_{ee}(k) \tag{4-25}$$

where

$$m_{ee}(k) = (1-\mu_3)\,m_{ee}(k-1) + \mu_3\,|e(k)|^2, \tag{4-26}$$

$$\mathbf{R}_i(k) = (1-\mu_3)\,\mathbf{R}_i(k-1) + \mu_3\,y_i(k)\,\mathbf{E}^*(k). \tag{4-27}$$
4.3,  Multistage  Algorithm 

In  Secs.  4.1  and  4.2  we  discussed  the  first  stage  of  the  al¬ 
gorithm  where  we  have  N  sensors  and  M  sources.  Now  we 
put  it  all  together  following  the  source-iterative  solution  of 
Sec.  3  and  discuss  extraction  of  M  sources  including  the 
cancellation  of  the  extracted  sources.  We  will  use  the  su¬ 
perscript  (m)  to  denote  the  various  quantities  pertaining  to 
stage  m.  These  have  been  used  previously  in  Secs.  4.1  and 
4.2  without  this  superscript;  e.g.  C^”^^(A:)  now  denotes  the 
estimate  of  the  tap-gain  vector  at  time  k  at  stage  m,  etc. 
Initialization: 


Y('’(jt)  =  as  in  (4-3) 
DO  FORm  =  l,2,--,M: 


(4-28) 


(4-29) 
(4-30) 


|lC(-)(fc)|| 

where 

sgn(7ir’);;^  { h- 

-  Y(-)*(i:)}  ,  (4  -  31) 

=  (1  -  (4  -  32) 

=  (1  -  +  M2e(’")^(fc),  (4  -  33) 

+  A‘2le('">(*)r.  (4-34) 

Tir^  =  -  2  (4  -  35) 

and 

e^’"^(fe)  =  d’"^^(fc)Y^’"^(fc).  (4-36) 


Set 


(2r,  15°, -0.65) 


5<.”')(fc)  =  (4-37) 

where  represents  (cf.  (3-7))  the  contribution  of 

the  extracted  source  at  the  m—th  stage  to  the  mea> 
surement  at  time  k  at  the  t— th  sensor,  and  where 

F<r\k)  =  R(’")(fc)/m<rHfc).  (4-38) 

m<?>(fc)  =  (1  -  -  1)  + 

£(”*>(/(:)=[  e( "')(*  + ^i)  ••• 

(4  -  40) 

and 

R<r\k)  =  +  tizy\’^\k)E^”'>(k). 

(4-41) 

Define 

n-^\k)=[^-'\k)  yl”')(fc-i.-t-l) 

(4-42) 

Set 

Y(”‘+’->(jfe)  =  [  y/”*+‘>^()fe) 

(4-43) 

where 


tx;3:  (r,^,a)  =  (0^,-35^0,7),  (lT,-5^0.4). 

Thus  the  signals  wi  and  wz  propagate  through  two  paths 
whereas  W2  parses  through  three  paths.  The  signals  arriv¬ 
ing  at  the  array  were  normalized  such  that  the  signal  powers 
for  users  1  and  2  are  equal,  and  3dB  higher  than  the  sig¬ 
nal  power  for  user  3.  Additive  white  (both  temporally  and 
spatially)  Gaussian  noise  was  added  to  the  array  measure¬ 
ments  to  achieve  a  signal- to-noise-ratio  (SNR)  of  ll.SBdB 
(ratio  =  100/7)  for  the  strongest  user(s). 

The  proposed  approach  was  applied  with  M  —  3  equal¬ 
izers  and  Af  —  1  =  2  signal  cancellers  running  in  parallel, 
each  successive  equalizer  put  in  operation  after  waiting  for 
200  samples  (symbols)  w.r.t.  the  previous  stage.  The  equgJ- 
izer  length  was  chosen  to  be  5  taps  per  sensor  (Ir«  =  5  in 
(4-2)).  The  initial  guess  for  the  tap  gains  was  taken  to  be 
center-tap  initialization:  set  Ct(2)  —  1  for  i  =  m  for  the 
m~th  stage  equalizer  (m  =  1,2,3)  with  the  remaining  tap 
gains  set  to  zero.  The  algorithm  step  sizes  and  forgetting 
factors  for  each  stage  m  were  chosen  as:  fii  =  0.003  in  (4- 
29),  fi2  =  0.015  in  (4-32)-(4-34)  and  fiz  =  0.0005  in  (4-39) 
and  (4-41).  For  the  running  cost  (4-45)  computation  we 
selected  /i*  =  0.002  in  (4-46)-(4-48).  The  parameters  Li 
and  L2  in  (4-40)  (see  also  (4-22)  and  (4-23))  were  selected 
as  Li  =  15  and  L2  “  6. 


y/’"+^>(fc)  =  Y^”'\k)  -  Yl'^\k).  (4-44) 


ENDDO 

The  sequence  {c^’”^(fc)}  in  (4-36)  represents  the  equalized 
(up  to  a  scale  factor  and  time  delay)  source  at  stage  m. 
Running  Cost,  To  monitor  the  convergence  of  the  equal¬ 
izers  in  various  stages  of  the  algorithm,  it  is  useful  to  calcu¬ 
late  a  running  cost  (4-8)  without  the  sign.  Let  denote 
the  running  cost  for  the  m— th  stage  at  time  fc,  given  by 


j{rrL) 


-2 


(4  -  45) 


where 

=  (1  -  +  M4|e^’"^(fe)P,  (4  -  46) 

=  (1  -  +  M4e'’">"(fc),  (4  -  47) 

=  (1  -  M4)m(p,Lo  +  /‘4|e("‘’(*)|*.  (4  -  48) 

For  all  of  the  simulations  presented  in  Sec,  5,  we  took  ^4  = 

0.002. 


5.  SIMULATION  EXAMPLES 
5.1.  Example  1;  3  BPSK  Sources  and  7  Sensors 
We  consider  a  wireless  communications  scenario  with  three 
(M  =  3)  BPSK  user  signals  arriving  at  a  uniform  linear 
array  of  W  =  7  sensors  via  a  frequency  selective  multipath 
channel.  The  array  elements  are  spaced  half  a  wavelength 
apart.  The  array  measurements  are  assumed  to  be  sam¬ 
pled  at  baud  rate  (for  simulation  convenience  only)  with 
sampling  interval  T  seconds  and  the  three  sources  have  the 
same  baud  rate.  The  relative  time  delay  r  (relative  to  the 
first  arrival^  the  angle  of  arrival  0  (in  degrees  w.r.t.  the 
array  broaaside)  and  the  relative  attenuation  factor  (am¬ 
plitude)  a  for  various  sources  were  selected  as: 

:  (r,0,a)  =  (OT,lO%O.5),  (1T,50^0.75) 
u;2:  (r,0,a)  =  (OT,-2O^O.5),  (IT,  45",  0.45), 


7  sensors,  3  BPSK  sources 


(avergaed  over  100  runs) 


Fig.  1.  Average  running  cost  for  Example  1. 

Fig. 1 shows the evolution of the average running cost (see (4-45)), averaged over 100 Monte Carlo runs after 'assigning' each equalizer cost to its corresponding extracted source. For BPSK sources the 4th-order normalized cumulant equals -2; therefore, at convergence, the running cost (4-45) should be close to -2. In Fig. 1 we see these values to be around -1.89, which is largely a consequence of noise in the data: the noise affects only the denominator of (4-45), making it larger than it should be. Table 1 shows the signal-to-interference-and-noise ratio (SINR) and the probability of error $P_e$ at the output of each equalizer at selected time instants, averaged over 100 Monte Carlo runs and 3000 symbols. [The equalizer tap gains at the chosen time instants were 'frozen' and used to equalize data of length 3000 symbols in order to calculate SINR and $P_e$. The equalized data were rotated, scaled and shifted before calculating the two performance measures.] It is seen from Fig. 1 and Table 1 that the proposed approach works well.
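As a rough consistency check on the value of about -1.89 quoted above, the following sketch assumes (this is an assumption, not a statement from the text) that the residual interference-plus-noise at the equalizer output is circularly symmetric Gaussian. Under that assumption it leaves the fourth-order cumulant unchanged and only inflates the output power by the factor (1 + 1/SINR). The function name and the 15 dB operating point (roughly the SINR level reported in Table 1) are illustrative choices.

```python
def noisy_normalized_cumulant(clean_value, sinr_db):
    """Normalized 4th-order cumulant of signal + circular Gaussian disturbance.

    clean_value: normalized cumulant of the noise-free signal (-2 for BPSK).
    sinr_db:     signal-to-interference-and-noise ratio in dB (assumed known).
    The disturbance adds nothing to the numerator (Gaussian cumulant is zero)
    but scales the squared power in the denominator by (1 + 1/SINR)^2.
    """
    sinr = 10.0 ** (sinr_db / 10.0)
    return clean_value / (1.0 + 1.0 / sinr) ** 2


# BPSK at an assumed output SINR of 15 dB
bpsk_at_15db = noisy_normalized_cumulant(-2.0, 15.0)
print(round(bpsk_at_15db, 3))
```

Under these assumptions the predicted value lies close to the -1.89 read off Fig. 1, which is consistent with the explanation that noise enlarges only the denominator.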


5.2.  Example  2:  2  Mixed  Sources  and  5  Sensors 

In this example we consider a 4-QAM user signal $w_1$ (4th-order normalized cumulant of -1) and a non-communications signal $w_2$ consisting of an i.i.d. complex Gaussian mixture (independent and identically distributed real and imaginary parts, with the real part being $\mathcal{N}(0,1)$ with probability 0.9 and $\mathcal{N}(0,4)$ with probability 0.1) with 4th-order normalized cumulant of 0.7433. The multipath channels for the two signals were selected as:

$w_1$: $(\tau, \theta, \alpha)$ = (0T, 10°, 0.5), (1T, 50°, 0.75)

$w_2$: $(\tau, \theta, \alpha)$ = (0T, -20°, 0.5), (1T, 45°, 0.45), (2T, 15°, -0.65).

The  two  signals  have  equal  power.  Additive  white  Gaussian 
noise  was  added  to  the  array  measurements  to  achieve  an 
SNR  of  13dB  (ratio  =  20)  for  each  user  signal. 

The proposed approach was applied with $M = 2$ equalizers and $M - 1 = 1$ signal cancellers running in parallel, the second equalizer being put in operation after waiting for 200 samples. The equalizer length was chosen to be 5 taps per sensor ($L_e = 5$ in (4-2)). The initial guess for the tap gains was taken to be center-tap initialization: set $C_i(2) = 1$ for $i = m$ for the $m$-th stage equalizer ($m = 1, 2$), with the remaining tap gains set to zero. The algorithm step sizes and forgetting factors were chosen as: $\mu_1 = 0.0005$ in (4-29), $\mu_2 = 0.015$ in (4-32)-(4-34) and $\mu_3 = 0.0005$ in (4-39) and (4-41) when the quantity defined in (4-35) is negative, and $\mu_1 = 0.0001$ in (4-29), $\mu_2 = 0.003$ in (4-32)-(4-34) and $\mu_3 = 0.0005$ in (4-39) and (4-41) when it is positive. The parameters $L_1$ and $L_2$ in (4-40) were selected as $L_1 = 15$ and $L_2 = 6$. For the running cost (4-45) computation we selected $\mu_4 = 0.002$ in (4-46)-(4-48). The counterparts to Fig. 1 and Table 1 are now shown in Fig. 2 and Table 2, respectively.

[Figure: 5 sensors, 2 'mixed' sources; average running cost (averaged over 100 runs)]

Fig. 2. Average running cost for Example 2.

TABLE 1

# of samples    SINR (dB)    P_e

User 1
  4000          14.89        < 3 x 10^-6
  6000          15.03        < 3 x 10^-6
  8000          15.13        < 3 x 10^-6
  12000         15.14        < 3 x 10^-6

User 2
  4000          14.25        0.0099
  6000          14.81        < 3 x 10^-6
  8000          14.94        < 3 x 10^-6
  12000         14.97        < 3 x 10^-6

User 3
  4000          14.84        < 3 x 10^-6
  6000          15.13        < 3 x 10^-6
  8000          15.19        < 3 x 10^-6
  12000         15.23        < 3 x 10^-6

TABLE 2

                User 1                      User 2
# of samples    SINR (dB)    P_e            SINR (dB)
  8000          14.12        0.0072         5.52
  12000         14.79        < 3 x 10^-6    9.28
  16000         15.06        < 3 x 10^-6    11.49

[8] 


BLIND  EQUALIZATION  OF  I.I.R.  SINGLE-INPUT  MULTIPLE-OUTPUT 
CHANNELS  WITH  COMMON  ZEROS  USING  SECOND-ORDER  STATISTICS 

Bin  Huang  Jitendra  K.  Tugnait 

Department  of  Electrical  Engineering 
Auburn  University,  Auburn,  Alabama  36849,  USA 
E-mail:  huangbi@eng.auburn.edu  tugnait@eng.auburn.edu 


ABSTRACT 

The problem of blind equalization of SIMO (single-input multiple-output) communications channels is considered using only the second-order statistics of the data. Such models arise when the data from a single receiver are fractionally sampled (assuming that there is excess bandwidth), or when an antenna array is used with or without fractional sampling. We focus on direct design of finite-length MMSE (minimum mean-square error) blind equalizers. Unlike the past work on this problem, we allow infinite impulse response (IIR) channels. Our approaches also work when the "subchannel" transfer functions have common zeros so long as the common zeros are minimum-phase zeros. Illustrative simulation examples are provided.

1.  INTRODUCTION 

Consider a discrete-time SIMO system with $N$ outputs and one input. The $i$-th component of the output at time $k$ is given by

$$y_i(k) = \mathcal{F}_i(z) w(k) + n_i(k), \quad i = 1, 2, \ldots, N, \tag{1-1}$$

$$\Rightarrow \; y(k) = \mathcal{F}(z) w(k) + n(k) = s(k) + n(k), \tag{1-2}$$

where $y(k) = [y_1(k) \; y_2(k) \; \cdots \; y_N(k)]^T$, similarly for $s(k)$ and $n(k)$, and $z$ is the $z$-transform variable as well as the backward-shift operator (i.e., $z^{-1}w(k) = w(k-1)$, etc.). The sequence $w(k)$ is the (single) input at sampling time $k$, $y_i(k)$ is the $i$-th noisy output, $s_i(k)$ is the $i$-th noise-free output, $n_i(k)$ is the additive measurement noise,

$$\mathcal{F}(z) := \sum_{l=0}^{\infty} F_l z^{-l} \tag{1-3}$$

and $\mathcal{F}_i(z)$ is the scalar transfer function with $w(k)$ as the input and $y_i(k)$ as the output; it represents the $i$-th subchannel. We allow all of the above variables to be complex-valued.

Such models arise in several useful baseband-equivalent digital communications and other applications. A case of some interest is that of fractionally-spaced samples of a single baseband received signal leading to a SIMO model [1],[4],[8]. Alternatively, a similar model can be derived when we have a single signal impinging upon an antenna array with $N$ elements [5]. A similar model arises if we have an antenna array coupled with fractional sampling at each array element [5]. In these applications one of the objectives is to recover the inputs $w(k)$ given the noisy measurements but not given the knowledge of the system transfer function. An overwhelming number of papers (see [4],[5],[9]-[12] and references therein) have concentrated on a two-step procedure: first estimate the channel impulse response (IR) and then design an equalizer using the estimated channel. A fundamental restriction in these works is that the channel is FIR with no common zeros among the various subchannels. A few (see [1] and [13], e.g.) have proposed direct design of the equalizer bypassing channel estimation. Still they assume FIR channels with no common zeros.

This work was supported by NSF Grant MIP-9312559 and by ONR Grant N00014-97-1-0822.

In this paper we allow IIR channels. We will also allow common zeros so long as they are minimum-phase. Finally, in the presence of nonminimum-phase common zeros, our proposed approach equalizes the spectrally-equivalent minimum-phase counterpart of $\mathcal{F}(z)$; it does not "fall apart" unlike quite a few existing approaches. We should note that our proposed approach is inspired by [1]. Unlike [1] our approach applies to antenna arrays since we do not require that $f_1(0) \neq 0$ but $f_i(0) = 0$ for $i = 2, 3, \ldots, N$ (as in [1]).

2.  PRELIMINARIES 

2.1.  FIR  Inverses 

Let $\mathcal{F}(z) = A^{-1}(z)B(z)$ where $A(z) = 1 + \sum_{i=1}^{n_a} a_i z^{-i}$ is $1 \times 1$ and $B(z) = \sum_{i=0}^{n_b} B_i z^{-i}$ is $N \times 1$. Assume

(H1) $N > 1$.

(H2) Rank$\{B(z)\} = 1 \;\forall z$ including $z = \infty$ but excluding $z = 0$, i.e., $B(z)$ is irreducible [7, Sec. 6.3].

(H3) $A(z) \neq 0$ for $|z| \geq 1$.

It has been shown in [6] (using some results from [2]) that under (H1)-(H3) there exists a finite-degree left-inverse (not necessarily unique) of $\mathcal{F}(z)$:

$$\mathcal{G}(z)\mathcal{F}(z) = 1 \tag{2-1}$$

where $\mathcal{G}(z)$ is $1 \times N$ given by

$$\mathcal{G}(z) = \sum_{i=0}^{L_e} G_i z^{-i} \quad \text{for any } L_e \geq n_a + n_b - 1. \tag{2-2}$$

Remark 1: The left-inverse $\mathcal{G}(z)$ of $\mathcal{F}(z)$ consists of two parts: $\mathcal{G}(z) = \mathcal{G}_B(z)A(z)$ where $\mathcal{G}_B(z)B(z) = 1$, so that $\mathcal{G}(z)\mathcal{F}(z) = \mathcal{G}_B(z)A(z)A^{-1}(z)B(z) = \mathcal{G}_B(z)B(z) = 1$. Finite-length left-inverses of FIR SIMO channels have been the subject of intense research activity [4]-[6], [8]-[13].
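To make the FIR part of Remark 1 concrete, the sketch below builds a finite left-inverse $\mathcal{G}_B(z)$ of a small FIR SIMO channel $B(z)$ by solving the linear equations "coefficients of $\mathcal{G}_B(z)B(z)$ equal $[1, 0, \ldots, 0]$" in the minimum-norm sense. The channel taps are random and purely hypothetical, the helper name `filtering_matrix` is ours, and real-valued data are used for brevity.

```python
import numpy as np


def filtering_matrix(B, Le):
    """Block-Toeplitz matrix S with [G_0 ... G_Le] @ S = coefficients of G(z)B(z).

    B has shape (nb + 1, N); row l holds the N subchannel taps B_l.
    """
    nb1, N = B.shape
    S = np.zeros((N * (Le + 1), Le + nb1))
    for i in range(Le + 1):            # block row i holds B(z) delayed by i
        for l in range(nb1):
            S[i * N:(i + 1) * N, i + l] = B[l]
    return S


rng = np.random.default_rng(0)
N, nb, Le = 3, 2, 2                    # illustrative sizes only
B = rng.standard_normal((nb + 1, N))   # hypothetical random FIR subchannels
S = filtering_matrix(B, Le)

target = np.zeros(Le + nb + 1)
target[0] = 1.0                        # G(z)B(z) = 1 (zero delay)

# Underdetermined system S^T g = target; lstsq returns the minimum-norm solution
g, *_ = np.linalg.lstsq(S.T, target, rcond=None)
combined = g @ S                       # coefficients of the cascade G(z)B(z)
print(np.round(combined, 6))
```

The row vector `g` stacks the taps $G_0, \ldots, G_{L_e}$; for an IIR channel $A^{-1}(z)B(z)$ the remark says one would additionally multiply by the scalar polynomial $A(z)$.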

2.2.  Linear  Innovations  Representations 
Assume  further  the  following: 

(H4) $\{w(k)\}$ is zero-mean, white. Take $E\{|w(k)|^2\} = 1$.

Lemma 1. Under (H1)-(H4), $\{s(k)\}$ may be represented as

$$s(k) = -\sum_{i=1}^{M} D_i s(k-i) + I_s(k) \tag{2-3}$$

where $M = n_a + n_b - 1$, the $D_i$'s are some $N \times N$ matrices such that $\det(\mathcal{D}(z)) \neq 0$ for $|z| \geq 1$, $\mathcal{D}(z) = I + \sum_{i=1}^{M} D_i z^{-i}$, and $\{I_s(k)\}$ is a zero-mean white $N \times 1$ random sequence (linear innovations for $\{s(k)\}$) with

$$E\{I_s(k)I_s^H(k)\} = F_0F_0^H \quad \text{and} \quad \|F_0\|^{-2}F_0^H I_s(k) = w(k). \tag{2-4}$$


Proof: Consider the process

$$s'(k) := A(z)s(k) = B(z)w(k). \tag{2-5}$$

By [9] and [14], under (H1), (H2) and (H4), we have

$$s'(k) = -\sum_{i=1}^{n_b-1} D'_i s'(k-i) + I'_s(k) \tag{2-6}$$

where the $D'_i$'s are some $N \times N$ matrices such that $\det(\mathcal{D}'(z)) \neq 0$ for $|z| \geq 1$, $\mathcal{D}'(z) = I + \sum_{i=1}^{n_b-1} D'_i z^{-i}$, and $\{I'_s(k)\}$ is a zero-mean white $N \times 1$ random sequence with

$$E\{I'_s(k)I'^H_s(k)\} = F_0F_0^H \quad \text{and} \quad \|F_0\|^{-2}F_0^H I'_s(k) = w(k). \tag{2-7}$$

Since $s(k) = A^{-1}(z)s'(k)$, it follows from (2-6) that (2-3) holds true with $I_s(k) = I'_s(k)$ such that $\mathcal{D}(z) = A(z)\mathcal{D}'(z)$. This completes the proof. □

Lemma 2. Let $\mathcal{R}_{ssL_e}$ denote a $[N(L_e+1)] \times [N(L_e+1)]$ matrix with its $ij$-th block element as $R_{ss}(j-i) = E\{s(k+j-i)s^H(k)\}$. Then under (H1)-(H4), $\rho(\mathcal{R}_{ssL_e}) \leq NL_e + 1$ for $L_e \geq n_a + n_b - 1$, where $\rho(A)$ denotes the rank of $A$.

Sketch of proof: It follows from Lemma 1 and (2-3) that

$$[\, I \;\; D_1 \;\; \cdots \;\; D_{n_a+n_b-1} \;\; 0 \;\; \cdots \;\; 0 \,]\, \mathcal{R}_{ssL_e} = [\, F_0F_0^H \;\; 0 \;\; \cdots \;\; 0 \,]. \tag{2-8}$$

Apply Sylvester's inequality [7, p. 655] to (2-8) to deduce the desired result. □

3.  BLIND EQUALIZATION: NO COMMON ZEROS

Assume that (H1)-(H4) hold true. In addition assume the following regarding the measurement noise:

(H5) $\{n(k)\}$ is zero-mean with $E\{n(k+\tau)n^H(k)\} = \sigma_n^2 I_{N \times N}\,\delta(\tau)$, where $I_{N \times N}$ is the $N \times N$ identity matrix.

3.1.  Zero-Delay  Zero-Forcing  Equalizer 

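Using (1-3), (2-1) and (2-2), we have

$$\sum_{l=0}^{\infty} \Big[ \sum_{i=0}^{L_e} G_i F_{l-i} \Big] z^{-l} = 1 \quad (F_l = 0 \text{ for } l < 0), \tag{3-1}$$

leading to

$$[\, G_0 \;\; G_1 \;\; \cdots \;\; G_{L_e} \,]\, \mathcal{S} = [\, 1 \;\; 0 \;\; \cdots \,] \tag{3-2}$$

where $\mathcal{S}$ is the $(N(L_e+1)) \times \infty$ matrix given by

$$\mathcal{S} = \begin{bmatrix} F_0 & F_1 & F_2 & F_3 & \cdots \\ 0 & F_0 & F_1 & F_2 & \cdots \\ \vdots & & \ddots & & \\ 0 & 0 & \cdots\; 0 & F_0 & F_1 \; \cdots \end{bmatrix}. \tag{3-3}$$

Let $\mathcal{S}^{\#}$ denote the pseudoinverse of $\mathcal{S}$. By [15, Prop. 1], $\mathcal{S}^{\#} = \mathcal{S}^H(\mathcal{S}\mathcal{S}^H)^{\#}$. Then the minimum norm solution to the FIR equalizer is given by [15, Sec. 6.11]

$$[\, G_0 \;\; G_1 \;\; \cdots \;\; G_{L_e} \,] = [\, F_0^H \;\; 0 \;\; \cdots \;\; 0 \,]\,(\mathcal{S}\mathcal{S}^H)^{\#}. \tag{3-4}$$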

In a fashion similar to $\mathcal{R}_{ssL_e}$ in Lemma 2, let $\mathcal{R}_{yyL_e}$ denote a $[N(L_e+1)] \times [N(L_e+1)]$ matrix with its $ij$-th block element as $R_{yy}(j-i) = E\{y(k+j-i)y^H(k)\}$; define similarly $\mathcal{R}_{nnL_e}$ pertaining to the additive noise. Carry out an eigendecomposition of $\mathcal{R}_{yyL_e}$. Then the smallest $N - 1$ eigenvalues of $\mathcal{R}_{yyL_e}$ equal $\sigma_n^2$ because, under (H1)-(H4), $\rho(\mathcal{R}_{ssL_e}) \leq NL_e + 1$ whereas $\rho(\mathcal{R}_{nnL_e}) = NL_e + N = \rho(\mathcal{R}_{yyL_e})$. Thus a consistent estimate $\hat{\sigma}_n^2$ of $\sigma_n^2$ is obtained by taking it as the average of the smallest $N - 1$ eigenvalues of $\hat{\mathcal{R}}_{yyL_e}$, the data-based consistent estimate of $\mathcal{R}_{yyL_e}$. Under (H4) and (H5),

$$(\mathcal{S}\mathcal{S}^H) = \mathcal{R}_{ssL_e} = \mathcal{R}_{yyL_e} - \mathcal{R}_{nnL_e} = \mathcal{R}_{yyL_e} - \sigma_n^2 I. \tag{3-5}$$
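The eigenvalue argument behind (3-5) can be checked numerically with exact (not sample) statistics. The sketch below uses a hypothetical real-valued IIR channel with $A(z) = 1 - 0.5z^{-1}$ and random $B(z)$, truncates the impulse response where it is numerically negligible, and uses helper names of our own choosing. It illustrates the rank property only and is not the authors' estimation code.

```python
import numpy as np


def impulse_response(a, B, K):
    """First K taps F_0..F_{K-1} of F(z) = B(z)/A(z), A(z) = 1 + sum_i a[i-1] z^-i."""
    F = np.zeros((K, B.shape[1]))
    for l in range(K):
        F[l] = B[l] if l < len(B) else 0.0
        for i, ai in enumerate(a, start=1):
            if l >= i:
                F[l] -= ai * F[l - i]
    return F


def stacked(F, Le):
    """Truncated version of the matrix S in (3-3): block row i is F delayed by i."""
    K, N = F.shape
    S = np.zeros((N * (Le + 1), K))
    for i in range(Le + 1):
        S[i * N:(i + 1) * N, i:] = F[:K - i].T
    return S


rng = np.random.default_rng(1)
N, nb, Le, sigma2 = 3, 2, 4, 0.1           # illustrative values; Le >= na + nb - 1
B = rng.standard_normal((nb + 1, N))       # hypothetical numerator taps
F = impulse_response([-0.5], B, K=200)     # pole at 0.5, so taps decay quickly
S = stacked(F, Le)

R_ss = S @ S.T                             # noise-free correlation (E|w|^2 = 1)
R_yy = R_ss + sigma2 * np.eye(N * (Le + 1))

eigvals = np.linalg.eigvalsh(R_yy)         # ascending order
sigma2_hat = eigvals[:N - 1].mean()        # average of the N - 1 smallest
R_ss_hat = R_yy - sigma2_hat * np.eye(N * (Le + 1))
print(sigma2_hat)
```

With sample correlations in place of the exact `R_yy`, the smallest eigenvalues would only cluster around the noise variance, which is why the text averages them.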

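Thus, $(\mathcal{S}\mathcal{S}^H)$ can be estimated from noisy data. However, we don't know $F_0$. To this end, we seek an $N \times N$ FIR filter $\mathcal{G}_a(z) := \sum_{i=0}^{L_e} G_{ai} z^{-i}$ satisfying

$$[\, G_{a0} \;\; G_{a1} \;\; \cdots \;\; G_{aL_e} \,] = [\, I_{N \times N} \;\; 0 \;\; \cdots \;\; 0 \,]\,(\mathcal{S}\mathcal{S}^H)^{\#}. \tag{3-6}$$

Comparing (3-4) and (3-6) it follows that

$$[\, G_0 \;\; G_1 \;\; \cdots \;\; G_{L_e} \,] = F_0^H\,[\, G_{a0} \;\; G_{a1} \;\; \cdots \;\; G_{aL_e} \,] \tag{3-7}$$

leading to

$$\sum_{i=0}^{L_e} G_i z^{-i} =: \mathcal{G}(z) = F_0^H \mathcal{G}_a(z). \tag{3-8}$$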


In practice, therefore, we apply $\mathcal{G}_a(z)$ to the data, leading to

$$v(k) := \mathcal{G}_a(z)y(k) = v_s(k) + \mathcal{G}_a(z)n(k) \tag{3-9}$$

such that

$$F_0^H v_s(k) = w(k) \tag{3-10}$$

where

$$v_s(k) := \mathcal{G}_a(z)\,[y(k) - n(k)] = \mathcal{G}_a(z)s(k). \tag{3-11}$$

In (3-10) $\{w(k)\}$ is a white scalar sequence (by assumption (H4)); however, $\{v_s(k)\}$ is not necessarily a white vector sequence. Given the second-order statistics of $\{v_s(k)\}$, how does one estimate $F_0$ so that $\{w(k)\}$ satisfying (H4) is recovered? We need to have $R_{ww}(\tau) := E\{w(k+\tau)w^*(k)\} = 0$ for $\tau \neq 0$. By (3-9), $R_{ww}(\tau) = F_0^H R_{v_sv_s}(\tau)F_0$. Define ($L > 0$ is some large integer)

$$\bar{R}_{v_sv_s} := [\, R^H_{v_sv_s}(-1) \;\; R^H_{v_sv_s}(-2) \;\; \cdots \;\; R^H_{v_sv_s}(-L) \,]^H \tag{3-12}$$

where $R_{v_sv_s}(\tau) := E\{v_s(k+\tau)v_s^H(k)\}$.

Lemma 3. $\bar{R}_{v_sv_s}$ is rank deficient for any $L \geq 1$ such that $\bar{R}_{v_sv_s}F_0 = 0$.

Proof: We have

$$R_{wv_s}(\tau) = E\{w(k+\tau)v_s^H(k)\} = 0 \quad \forall \tau \geq 1 \tag{3-13}$$

because $v_s(k)$ is obtained by causal filtering of $y(k)$, hence of $w(k)$. Using (3-10) in (3-13) it then follows that there exists an $N \times 1$ $F_0 \neq 0$ such that $F_0^H R_{v_sv_s}(\tau) = 0 \;\forall \tau \geq 1$. Equivalently (since $R_{v_sv_s}(-\tau) = R^H_{v_sv_s}(\tau)$),

$$R_{v_sv_s}(-\tau)F_0 = 0 \quad \forall \tau \geq 1. \tag{3-14}$$

The desired result is then immediate. □

Pick an $N \times 1$ column-vector $H_0$ to equal the right-most right singular vector in a singular-value decomposition (SVD) $\bar{R}_{v_sv_s} = U\Sigma V^H$, i.e. the right singular vector corresponding to the smallest singular value. In other words, pick $H_0$ to equal the last column of $V$. Then, since ideally the smallest singular value of $\bar{R}_{v_sv_s}$ is zero, we have $H_0^H R_{v_sv_s}(\tau)H_0 = 0$ for $\tau = 1, 2, \ldots, L$. Since the overall system with $w(k)$ as input and $H_0^H v_s(k)$ as output is ARMA($n_a$, $n_b + L_e$), it follows that $H_0^H v_s(k)$ is zero-mean white if $L \geq n_b + L_e$; hence, it is a scaled version of $w(k)$. Therefore, we have ($\alpha \neq 0$)

$$H_0^H v_s(k) =: w'(k) = \alpha w(k) \tag{3-15}$$

(because $\bar{R}_{v_sv_s}H_0 = 0$). Thus, once $H_0$ is found, one has the complete inverse filter to recover a scaled version of $w(k)$ via a zero-forcing filter.
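The null-space property of Lemma 3 and the SVD step can be illustrated with exact statistics for a small hypothetical channel. Everything below (real-valued data, the random $B(z)$, $A(z) = 1 - 0.5z^{-1}$, the truncation length, the eigenvalue threshold used to form the pseudoinverse, and all helper names) is an assumption made for this sketch. It checks that $F_0$ is annihilated by the stacked correlation matrix, and then reports how well the smallest right singular vector aligns with $F_0$.

```python
import numpy as np


def impulse_response(a, B, K):
    """First K taps of F(z) = B(z)/A(z), with A(z) = 1 + sum_i a[i-1] z^-i."""
    F = np.zeros((K, B.shape[1]))
    for l in range(K):
        F[l] = B[l] if l < len(B) else 0.0
        for i, ai in enumerate(a, start=1):
            if l >= i:
                F[l] -= ai * F[l - i]
    return F


def stacked(F, Le):
    """Truncated matrix S of (3-3): block row i is the impulse response delayed by i."""
    K, N = F.shape
    S = np.zeros((N * (Le + 1), K))
    for i in range(Le + 1):
        S[i * N:(i + 1) * N, i:] = F[:K - i].T
    return S


rng = np.random.default_rng(2)
N, nb, Le, L = 3, 2, 4, 8
B = rng.standard_normal((nb + 1, N))
F = impulse_response([-0.5], B, K=300)
S = stacked(F, Le)
R_ss = S @ S.T

# Pseudoinverse of the (rank-deficient) R_ss through its eigendecomposition
lam, U = np.linalg.eigh(R_ss)
keep = lam > 1e-10 * lam.max()
R_pinv = (U[:, keep] / lam[keep]) @ U[:, keep].T

Ga = R_pinv[:N, :]          # (3-6): [I 0 ... 0] (S S^H)^#
V = Ga @ S                  # column l is V_l, with v_s(k) = sum_l V_l w(k - l)
K = V.shape[1]

# R_vv(-tau) = E{v_s(k - tau) v_s^H(k)} = sum_l V_l V_{l+tau}^T (real data)
R_bar = np.vstack([V[:, :K - t] @ V[:, t:].T for t in range(1, L + 1)])

_, svals, Vh = np.linalg.svd(R_bar)
H0 = Vh[-1]                 # right singular vector of the smallest singular value
F0 = F[0]
residual = np.linalg.norm(R_bar @ F0)
alignment = abs(H0 @ F0) / np.linalg.norm(F0)
print(svals[-1], residual, alignment)
```

An `alignment` close to one indicates that $H_0$ is, up to sign and scale, the unknown $F_0$; this is expected when the null space is one-dimensional, which the sketch does not prove.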

Remark 2: $F_0$ can also be estimated (up to a scale factor, as unit norm $H_0$) using the prediction error method of [9],[14] (even though [9] and [14] restrict their discussion to FIR models and real-valued data). Using (2-3) we obtain ($L_e \geq 1$)

$$[\, D_1 \;\; \cdots \;\; D_{L_e} \,]\, \mathcal{R}_{ss(L_e-1)} = -[\, R_{ss}(1) \;\; \cdots \;\; R_{ss}(L_e) \,] \tag{3-16}$$

leading to the minimum norm solution

$$[\, D_1 \;\; \cdots \;\; D_{L_e} \,] = -[\, R_{ss}(1) \;\; \cdots \;\; R_{ss}(L_e) \,]\, \mathcal{R}^{\#}_{ss(L_e-1)}. \tag{3-17}$$

Note that if $L_e > n_a + n_b - 1$, then $D_i = 0$ for all $i > n_a + n_b - 1$ by Lemma 2. By (2-3)-(2-4) we have

$$R_{II}(0) = F_0F_0^H = R_{ss}(0) + \sum_{i=1}^{L_e} D_i R_{ss}(-i). \tag{3-18}$$

Clearly $\rho(R_{II}(0)) = 1$. Carry out an eigendecomposition of $R_{II}(0)$. Pick $H_0$ as the unit norm eigenvector corresponding to the largest eigenvalue (ideally the only nonzero eigenvalue) of $R_{II}(0)$. □

Remark 3: It is worth noting that although $F_0^H v_s(k) = w(k)$ (see (3-10)) and $\|F_0\|^{-2}F_0^H I_s(k) = w(k)$ (see (2-4)), $\{I_s(k)\}$ is zero-mean white (linear innovations) whereas $\{v_s(k)\}$ is in general colored. □

3.2.  MMSE  Equalizer  with  Delay  d 
We wish to design an MMSE linear equalizer of a specified length. Using the orthogonality principle [16], the MMSE equalizer of length $L_e + 1$ to estimate $w(k-d)$ ($d \geq 0$) based upon $y(n)$, $n = k, k-1, \ldots, k-L_e$, satisfies

$$[\, G_{d,0} \;\; G_{d,1} \;\; \cdots \;\; G_{d,L_e} \,]\, \mathcal{R}_{yyL_e} = [\, F_d^H \;\; F_{d-1}^H \;\; \cdots \;\; F_0^H \;\; 0 \;\; \cdots \;\; 0 \,] \tag{3-19}$$

where $\mathcal{R}_{yyL_e}$ has its $ij$-th block element given by $R_{yy}(j-i)$. Clearly one can obtain a consistent estimate of $\mathcal{R}_{yyL_e}$ from the given data. It remains to estimate the $F_i$'s to complete the design. Here the discussion of Sec. 3.1 becomes relevant. There we found an $H_0$ to satisfy (3-15). From (3-9) and (3-15) we have

$$H_0^H v_s(k) = H_0^H \sum_{i=0}^{L_e} G_{ai}\, s(k-i). \tag{3-20}$$

Using (1-2), (3-15) and (3-20), we have

$$F_\tau^H = \alpha^{-1} H_0^H \sum_{i=0}^{L_e} G_{ai} R^H_{ss}(\tau + i). \tag{3-21}$$


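Let $\mathcal{R}_{d,ssL_e}$ denote a $[N(L_e+1)] \times [N(L_e+1)]$ matrix with its $ij$-th block element as $E\{s(k-d+j-i)s^H(k)\}$. Using (3-6) and (3-21) in (3-19) we obtain the desired solution

$$[\, G_{d,0} \;\; G_{d,1} \;\; \cdots \;\; G_{d,L_e} \,] = \alpha^{-1} H_0^H\, [\, I_{N \times N} \;\; 0 \;\; \cdots \;\; 0 \,]\, \mathcal{R}^{\#}_{ssL_e}\, \mathcal{R}_{d,ssL_e}\, \mathcal{R}^{-1}_{yyL_e}. \tag{3-22}$$

A scaled MMSE estimate of $w(t-d)$ is then given by

$$\hat{w}'(t-d) = \sum_{i=0}^{L_e} \alpha\, G_{d,i}\, y(t-i). \tag{3-23}$$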
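For orientation, the normal equations (3-19) can be solved directly when the channel is known. The sketch below is therefore non-blind: it uses a hypothetical random real-valued FIR channel and exact statistics, and it is meant only to show the structure of (3-19) and the resulting MSE, not the blind design of (3-22). The variable names are our own.

```python
import numpy as np

rng = np.random.default_rng(3)
N, nb, Le, d, sigma2 = 3, 2, 5, 2, 0.05   # illustrative sizes, delay and noise level
F = rng.standard_normal((nb + 1, N))      # hypothetical channel taps F_0..F_nb

# Block-Toeplitz matrix S: block row i holds the channel delayed by i samples
K = Le + nb + 1
S = np.zeros((N * (Le + 1), K))
for i in range(Le + 1):
    S[i * N:(i + 1) * N, i:i + nb + 1] = F.T

# Exact correlation of the stacked data vector (unit-power white input, white noise)
R_yy = S @ S.T + sigma2 * np.eye(N * (Le + 1))

# Right-hand side of (3-19): column d of S stacks F_d, F_{d-1}, ..., F_0, 0, ...
p = S[:, d]

g = np.linalg.solve(R_yy, p)              # stacked equalizer taps G_{d,0..Le}
mse = 1.0 - g @ p                         # minimum mean-square error
combined = g @ S                          # channel-equalizer cascade response
target = np.zeros(K)
target[d] = 1.0
mse_check = np.sum((combined - target) ** 2) + sigma2 * (g @ g)
print(mse, mse_check)
```

The two MSE expressions agree at the optimum: the first follows from the orthogonality principle, the second sums the residual intersymbol interference and the filtered noise power.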

3.3.  Summary  of  Algorithms 

Given data $y(k)$, $k = 1, 2, \ldots, T$. Pick the length $L_e + 1$ and delay $d$ of the MMSE equalizer. Estimate all correlation functions by sample averaging.

3.3.1. ALGORITHM I:

Here $F_0$ is estimated as the unit norm $H_0$ that lies in the null space of $\bar{R}_{v_sv_s}$. Estimate noise-free correlations via (3-5). Use (3-22) and (3-23) for MMSE equalizer design.

3.3.2. ALGORITHM II:

Here $F_0$ is estimated as in Remark 2. The rest is as in ALGORITHM I.

3.3.3. ALGORITHM III:

Here we will use (3-19) with $F_i$ ($i = 0, 1, \ldots, d$) estimated using the basic approach of [9] and [14]. Although [9] and [14] derive all their results under the assumption of FIR channels with no common zeros, their results extend (with straightforward modifications) to models that satisfy (H1)-(H5) by virtue of Lemma 1.

4.  BLIND  EQUALIZATION:  COMMON  ZEROS 

4.1.  Minimum-Phase  Zeros 
Here the SIMO transfer function is

$$\mathcal{F}(z) = [B_c(z)/A(z)]\,B(z) \tag{4-1}$$

where $B(z)$ satisfies (H2) and $B_c(z)$ is a finite-degree scalar polynomial that collects all the common zeros of the subchannels. Assume that

(H6) Given model (4-1), $B_c(z) \neq 0$ for $|z| \geq 1$.

Then while $A^{-1}(z)B(z)$ has a finite inverse, $B_c^{-1}(z)$ is IIR though causal under (H6). Then (3-2) holds true approximately for "large" $L_e$, the approximation getting better with increasing $L_e$. Similarly Lemma 1 holds true approximately for "large" $M$ and Lemma 2 also holds true approximately for $L_e \geq M$. It is then readily seen that the developments of Secs. 3.1, 3.2 and 3.3 are applicable.

4.2.  Arbitrary  Zeros 

In this case (4-1) is true but $B_c(z)$ does not necessarily satisfy (H6). We may rewrite (4-1) as

$$\mathcal{F}(z) = \bar{\mathcal{F}}(z)\,\mathcal{F}_{AP}(z) \tag{4-2}$$

where $\mathcal{F}_{AP}(z)$ is an allpass (rational) function such that

$$B_c(z) = \mathcal{F}_{AP}(z)\,B_{mp}(z) \tag{4-3}$$

and $B_{mp}(z)$ is minimum-phase. Thus (within a scale factor) we have

$$\bar{\mathcal{F}}(z) = [B_{mp}(z)/A(z)]\,B(z). \tag{4-4}$$

We may rewrite (1-2) as

$$y(k) = \bar{\mathcal{F}}(z)w'(k) + n(k) \quad \text{where} \quad w'(k) := \mathcal{F}_{AP}(z)w(k). \tag{4-5}$$

Clearly $w'(k)$ satisfies (H4). Hence, (4-4)-(4-5) satisfy the requirements of Sec. 4.1. Therefore, one can "approximately" recover $w'(k)$ from the given data by applying the algorithms of Sec. 3.3. In order to recover $w(k)$ from $w'(k)$, one needs to exploit the higher-order statistics of $w'(k)$; see [2],[3] and references therein.
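The factorization into an allpass part and a minimum-phase part can be illustrated for a single first-order common zero. In the sketch below the zero location $z_0 = 2$ is a hypothetical example: reflecting it to $1/z_0^*$ gives a minimum-phase polynomial with the same magnitude response, and the ratio of the two is allpass. Second-order statistics cannot distinguish the two polynomials, which is why only the minimum-phase equivalent is equalized.

```python
import numpy as np

z0 = 2.0 + 0.0j                         # hypothetical nonminimum-phase common zero
omega = np.linspace(0.0, 2.0 * np.pi, 256, endpoint=False)
zinv = np.exp(-1j * omega)              # z^{-1} evaluated on the unit circle

Bc = 1.0 - z0 * zinv                    # B_c(z) = 1 - z0 z^{-1}, zero at z0
Bmp = -np.conj(z0) + zinv               # B_mp(z): zero reflected to 1 / conj(z0)
F_ap = Bc / Bmp                         # allpass factor, so that B_c = F_ap * B_mp

mp_zero = 1.0 / np.conj(z0)             # location of the reflected zero
print(abs(mp_zero), np.max(np.abs(np.abs(F_ap) - 1.0)))
```

The printed deviation of the allpass magnitude from one is at rounding-error level, and the reflected zero lies inside the unit circle as required by (H6).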


[Figure: Example 1, delay = 3]

Fig. 1. Normalized MSE after MMSE equalization with d = 3. Solid lines: T = 250 symbols, dashed lines: T = 1000 symbols.

5.  SIMULATION  EXAMPLES 

5.1.  Example  1. 

We have $N = 3$ in (1-2) with $\mathcal{F}(z) = A^{-1}(z)B(z)$ where

$$A(z) = (1 - 0.5z^{-1})\,I_{3 \times 3} \tag{5-1}$$

and $B(z)$ is $3 \times 1$ MA(6) obtained from [10] as follows. Consider a raised cosine pulse $p_6(t, 0.1)$ with a roll-off factor 0.1, truncated to a length of $6T_s$ ($T_s$ = symbol duration). As in [10], a two-ray multipath channel with (effective) impulse response $h(t) = p_6(t, 0.1) - 0.7\,p_6(t - T_s/3, 0.1)$ was sampled at intervals of $T_s/3$ (starting at $t = -3T_s$) to create the $B(z)$ above. Transfer function $B(z)$ satisfies (H2) [10]; therefore, there exists a finite left inverse of length $L_e = 6$ (cf. Sec. 2.1). The scalar input $w(k)$ is 4-QAM. An MMSE equalizer of length $L_e = 8$ (9 taps per subchannel, totaling 27 taps; overfitting) was designed with a delay $d = 3$ (arbitrarily selected just for illustration). Algorithms I-III were applied for record lengths $T = 250$ and 1000 symbols with varying SNRs. Fig. 1 shows the normalized MSE (MSE divided by $E\{|w(k)|^2\}$). It is seen that the proposed design approach can handle IIR channels with little difficulty. Algorithm I (newly proposed) performs the best.

5.2.  Example  2. 

Again we have $N = 3$ in (1-2) but with $\mathcal{F}(z) = B_c(z)B(z)$ where $B(z)$ is as in Example 1 and $B_c(z)$ is a scalar polynomial given by

$$B_c(z) = 1 - 0.5z^{-1}. \tag{5-2}$$

Thus all three subchannels have a common zero at 0.5. The input $w(k)$ is 4-QAM as in Example 1. Note that in this example a finite left inverse does not exist. As in Example 1, an MMSE equalizer of length $L_e = 12$ was designed with a delay $d = 3$. Fig. 2 shows the normalized MSE averaged over 100 Monte Carlo runs. It is seen that the proposed design approaches can handle subchannels with common minimum-phase zeros with little difficulty. As in Example 1, Algorithm I performs the best.


[Figure: Example 2, delay = 3]

Fig. 2. Normalized MSE after MMSE equalization with d = 3. Solid lines: T = 250 symbols, dashed lines: T = 1000 symbols.

ABSTRACT

This paper is concerned with the problem of blind separation of independent signals (sources) from their linear convolutive mixtures. The various signals are assumed to be linear non-Gaussian but not necessarily i.i.d. Recently an iterative, normalized higher-order cumulant maximization based approach was developed using the fourth-order normalized cumulants of the "beamformed" data. A byproduct of this approach is a decomposition of the given data at each sensor into its independent signal components. In this paper an adaptive implementation of the above approach is developed using a stochastic gradient approach. Some further enhancements, including a Wiener filter implementation for signal separation and adaptive filter reinitialization, are also provided. A computer simulation example is presented.

1.  INTRODUCTION 

Given noisy measurements $y_i(k)$ ($i = 1, 2, \ldots, N$) at time $k$ at $N$ sensors, let these measurements be a linear convolutive mixture of $M$ source signals $x_j(k)$ ($j = 1, 2, \ldots, M$):

$$y_i(k) = \sum_{j=1}^{M} \mathcal{H}_{ij}(z)\,x_j(k) + n_i(k), \quad i = 1, 2, \ldots, N, \tag{1-1}$$

$$\Rightarrow \; y(k) = \mathcal{H}(z)x(k) + n(k), \tag{1-2}$$

where the $ij$-th element of $\mathcal{H}(z)$ is $\mathcal{H}_{ij}(z)$, $y(k) = [y_1(k) \; y_2(k) \; \cdots \; y_N(k)]^T$, similarly for $x(k)$ and $n(k)$, $z^{-1}$ is both the backward-shift operator (i.e., $z^{-1}x(k) = x(k-1)$, etc.) as well as the complex variable in the $z$-transform, $x_j(k)$ is the $j$-th input at sampling time $k$, $y_i(k)$ is the $i$-th output, $n_i(k)$ is the additive Gaussian measurement noise, and $\mathcal{H}_{ij}(z) := \sum_{l} h_{ij}(l) z^{-l}$ is the scalar transfer function with $x_j(k)$ as the input and $y_i(k)$ as the output. We allow all of the above variables to be complex-valued.

Suppose that we design a MIMO dynamic system $\mathcal{G}(z)$ with $N$ inputs and $M$ outputs such that the overall $M \times M$ system

$$\mathcal{T}(z) := \mathcal{G}(z)\mathcal{H}(z) \tag{1-3}$$

decouples the source signals. Following the $2 \times 2$ case considered in [4], this implies that we must have ($\mathcal{T}_{ij}(z)$ denotes the $ij$-th element of $\mathcal{T}(z)$)

$$\mathcal{T}_{ij}(z) \begin{cases} = 0 & \text{for } i \neq i_j \\ \neq 0 & \text{for } i = i_j \end{cases} \tag{1-4}$$

where $i = 1, 2, \ldots, M$; $j = 1, 2, \ldots, M$ and $i_j \in \{1, 2, \ldots, M\}$ such that $i_j \neq i_l$ for $j \neq l$. That is, in every column and every row of $\mathcal{T}(z)$ there is exactly one non-zero entry. In a blind separation problem, the nonzero entries of $\mathcal{T}(z)$ are allowed to be a scalar linear system (shaping filter), unlike the equalization problems where they must be constant gains and/or pure delays.

The problem considered above arises in a wide variety of applications: array processing, speech enhancement ("cocktail party" problem), and noise cancellation; see [1]-[12] and references therein. The prior work done can be classified into two broad categories based upon the underlying propagation model: instantaneous mixtures and convolutive mixtures. The general model (1-2) represents a linear convolutive mixture. The work reported in [4], [7] and [11] (and references therein) deals with linear convolutive mixture (dynamic mixing) models. Past work on separation of convolutive mixtures may be categorized into several classes: time-domain approaches ([7], [8], [9], [10]), frequency-domain approaches ([4], [11]), adaptive (recursive) approaches ([7], [9], [10]) and non-recursive (batch) approaches ([4], [8], [11]). In this paper we present time-domain adaptive approaches. Quite a few of the existing approaches are limited either to $M = N = 2$ ([4], [9]) or to $M = N$ ([7]). Although [11] treats a general case, their analysis is restricted to the case of two sources ($M = 2$). In this paper we consider a general case of $N \geq M$ with $M$ arbitrary.

This work was supported by NSF Grant MIP-9312559 and by ONR Grant N00014-97-1-0822.

2.  MODEL ASSUMPTIONS

We impose the following conditions on model (1-1)-(1-2):

(AS1) $N \geq M$ (at least as many outputs as inputs).

(AS2) The vector sequence $\{x(k)\}$ is stationary, its various components are mutually independent, and $\mathcal{H}(z)$ is stable. Moreover, $\{x(k)\}$ is linear, i.e.

$$x(k) = \mathcal{V}(z)w(k), \tag{2-1}$$

where $\{w(k)\}$ is a zero-mean, $M$-vector stationary non-Gaussian process, temporally i.i.d. and spatially independent, with nonzero fourth cumulants. Because of the mutual independence of the components of $x(k)$, we take $\mathcal{V}(z)$ to be diagonal.

(AS3) Consider the composite system

$$y(k) = \mathcal{F}(z)w(k) + n(k), \quad \text{with} \quad \mathcal{F}(z) := \mathcal{H}(z)\mathcal{V}(z). \tag{2-2}$$

Assume that rank$\{\mathcal{F}(z)\} = M$ for any $|z| = 1$.

(AS4) Since the composite system is causal, we have

$$\mathcal{F}(z) = \sum_{l=0}^{\infty} F_l z^{-l} \approx \sum_{l=0}^{L} F_l z^{-l}. \tag{2-3}$$

(AS5) The noise $\{n(k)\}$ is a zero-mean, stationary Gaussian sequence independent of $\{w(k)\}$.

Let $\mathcal{F}^{(i)}(z)$ denote the $i$-th column of $\mathcal{F}(z)$. In blind convolutive signal separation we are interested in decomposing the observations at the various sensors into their independent components. That is, our objective is to estimate $\mathcal{F}^{(i)}(z)w_i(k)$ for $i = 1, 2, \ldots, M$ given $\{y(k)\}$ without having a prior knowledge of $\mathcal{F}(z)$. Denote the $ij$-th element of $\mathcal{F}(z)$ as $\mathcal{F}_{ij}(z)$.


3.  A  BATCH  SOLUTION  [8] 

In this section we briefly discuss the batch (non-recursive) approach of [8]; its adaptive version is developed in Sec. 4. Let CUM$_4(w)$ denote the fourth-order cumulant of a complex-valued scalar zero-mean random variable $w$, defined as

$$\mathrm{CUM}_4(w) := E\{|w|^4\} - 2\,[E\{|w|^2\}]^2 - |E\{w^2\}|^2. \tag{3-1}$$

Consider a $1 \times N$ row-vector polynomial equalizer (filter) $\mathcal{C}^T(z)$, with its $i$-th entry denoted by $C_i(z)$, operating on the data vector $y(k)$. Let the equalizer output be denoted by $e(k)$:

$$e(k) = \sum_{i=1}^{N} C_i(z)\,y_i(k). \tag{3-2}$$

Following [6] consider maximization of the cost

$$J := \frac{|\mathrm{CUM}_4(e(k))|}{[E\{|e(k)|^2\}]^2} \tag{3-3}$$

for designing a linear equalizer to recover one of the inputs. It is shown in [6] that when (3-3) is maximized w.r.t. $\mathcal{C}(z)$, then (3-2) reduces to

$$e(k) = d\,w_{j_0}(k - k_0), \tag{3-4}$$

where $d$ is some complex constant, $k_0$ is some integer, and $j_0$ indexes some input out of the given $M$ inputs.
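The signed, normalized version of the cumulant in (3-1) and (3-3) can be evaluated exactly for finite constellations. The sketch below does so for BPSK and 4-QAM by averaging over equiprobable constellation points; the function name is ours, and the signed value is returned (the cost (3-3) uses its magnitude).

```python
import numpy as np


def normalized_cum4(points):
    """Signed normalized 4th-order cumulant of a zero-mean equiprobable constellation.

    Implements CUM4(w) / (E|w|^2)^2 with CUM4 as in (3-1).
    """
    w = np.asarray(points, dtype=complex)
    m2 = np.mean(np.abs(w) ** 2)        # E{|w|^2}
    m4 = np.mean(np.abs(w) ** 4)        # E{|w|^4}
    m2_tilde = np.mean(w ** 2)          # E{w^2}; vanishes for 4-QAM
    cum4 = m4 - 2.0 * m2 ** 2 - abs(m2_tilde) ** 2
    return cum4 / m2 ** 2


bpsk = normalized_cum4([1, -1])
qam4 = normalized_cum4([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j])
print(bpsk, qam4)
```

The difference between the two values comes entirely from the term $|E\{w^2\}|^2$, which is nonzero for the real-valued BPSK alphabet and zero for 4-QAM.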

A source-iterative solution is given by [8]:

Step 1. Maximize (3-3) w.r.t. $\mathcal{C}(z)$ to obtain (3-4).

Step 2. Cross-correlate $\{e(k)\}$ (of (3-4)) with the given data (2-2) and define a possibly scaled and shifted estimate of $f_{ij_0}(\tau)$ as

$$\hat{f}_{ij_0}(\tau) := \frac{E\{y_i(k)\,e^*(k-\tau)\}}{E\{|e(k)|^2\}} \tag{3-5}$$

where $\mathcal{F}_{ij}(z) = \sum_{l=-\infty}^{\infty} f_{ij}(l) z^{-l}$. Consider now the reconstructed contribution of $e(k)$ to the data $y_i(k)$ ($i = 1, 2, \ldots, N$), denoted by $\hat{y}_{i,j_0}(k)$:

$$\hat{y}_{i,j_0}(k) := \sum_{l} \hat{f}_{ij_0}(l)\,e(k-l). \tag{3-6}$$

Step 3. Remove the above contribution from the data to define the outputs of a MIMO system with $N$ outputs and $M - 1$ inputs. These are given by

$$y'_i(k) := y_i(k) - \hat{y}_{i,j_0}(k). \tag{3-7}$$

Step 4. If $M > 1$, set $M \leftarrow M - 1$, $y_i(k) \leftarrow y'_i(k)$, and go back to Step 1, else quit.
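Steps 2 and 3 above can be sketched in isolation by assuming that Step 1 has already succeeded, i.e. that the equalizer output is exactly a scaled, delayed copy of one source as in (3-4). Everything in the example is hypothetical: two real BPSK sources, random FIR channels, the scale $d = 2$, the delay $k_0 = 1$, the lag window and the helper `delay`. Expectations are replaced by sample averages.

```python
import numpy as np


def delay(x, n):
    """Return x(k - n) with zero fill; a negative n advances the sequence."""
    out = np.zeros_like(x)
    if n > 0:
        out[n:] = x[:-n]
    elif n < 0:
        out[:n] = x[-n:]
    else:
        out[:] = x
    return out


rng = np.random.default_rng(4)
T, N, Lh = 20000, 3, 3
w1 = rng.choice([-1.0, 1.0], size=T)       # source assumed extracted in Step 1
w2 = rng.choice([-1.0, 1.0], size=T)       # remaining source
h1 = rng.standard_normal((N, Lh))          # hypothetical channels to the N sensors
h2 = rng.standard_normal((N, Lh))
y1 = np.stack([np.convolve(w1, h1[i])[:T] for i in range(N)])
y2 = np.stack([np.convolve(w2, h2[i])[:T] for i in range(N)])
y = y1 + y2 + 0.1 * rng.standard_normal((N, T))

e = 2.0 * delay(w1, 1)                     # e(k) = d w1(k - k0), cf. (3-4)

L1, L2 = 2, 3                              # lag window: tau = -L1, ..., L2
power = np.mean(np.abs(e) ** 2)
y_hat = np.zeros_like(y)
for tau in range(-L1, L2 + 1):
    e_tau = delay(e, tau)                  # e(k - tau)
    f_hat = (y @ np.conj(e_tau)) / T / power   # (3-5): one estimate per sensor
    y_hat += np.outer(f_hat, e_tau)        # accumulate the reconstruction (3-6)

y_res = y - y_hat                          # (3-7): data with source 1 removed
rel_err = np.mean((y_hat - y1) ** 2) / np.mean(y1 ** 2)
print(rel_err)
```

The unknown scale and delay of $e(k)$ cancel in the product $\hat{f}_{ij_0}(l)\,e(k-l)$, so the reconstruction matches the true contribution of the extracted source without resolving either ambiguity.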

It has been shown in [6],[8] that

$$\hat{y}_{i,j_0}(k) = \sum_{l} f_{ij_0}(l)\,w_{j_0}(k-l), \tag{3-8}$$

i.e., we have decomposed the observations at the various sensors into their independent components: $\hat{y}_{i,j_0}(k)$ in (3-8) represents the contribution of the source $w_{j_0}$ to the $i$-th sensor, thereby achieving blind signal separation. It has been shown in [6] that under the conditions (AS1)-(AS4) and no noise, the proposed iterative approach is capable of blind identification of a MIMO transfer function $\mathcal{F}(z)$ up to a time-shift, a scaling and a permutation matrix, provided that we allow doubly-infinite equalizers.


4.  ADAPTIVE  ALGORITHM 

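In this section we develop a stochastic gradient-based "recursification" of all of the batch optimization steps discussed in Sec. 3. We will use the superscript $(m)$ to denote the various quantities pertaining to stage $m$ of the batch algorithm of Sec. 3 (i.e. $m$-th execution of Steps 1-4 therein). Let the length of the equalizer $\mathcal{C}(z)$ be $L_e$ and let

$$C_i(z) = \sum_{l=0}^{L_e-1} c_i(l) z^{-l}. \tag{4-1}$$

Initialization: Define

$$Y_i(k) = [\, y_i(k) \;\; \cdots \;\; y_i(k-L_e+1) \,]^T, \tag{4-2}$$

$$Y^{(1)}(k) = [\, Y_1^T(k) \;\; \cdots \;\; Y_N^T(k) \,]^T, \tag{4-3}$$

$$y^{(1)}(k) = y(k). \tag{4-4}$$

DO FOR $m = 1, 2, \ldots, M$:

$$\tilde{C}^{(m)}(k) = C^{(m)}(k-1) + \mu_1 \nabla_{C^*} J_k^{(m)}(C^{(m)}(k-1)), \tag{4-5}$$

$$C^{(m)}(k) = \tilde{C}^{(m)}(k)/\|\tilde{C}^{(m)}(k)\| \tag{4-6}$$

where

[Eq. (4-7), the expression for the instantaneous gradient $\nabla_{C^*} J_k^{(m)}$ in terms of the running moments below and $Y^{(m)*}(k)$, is illegible in the source.]

$$m_{2e}^{(m)}(k) = (1 - \mu_2)\,m_{2e}^{(m)}(k-1) + \mu_2 |e^{(m)}(k)|^2, \tag{4-8}$$

$$\tilde{m}_{2e}^{(m)}(k) = (1 - \mu_2)\,\tilde{m}_{2e}^{(m)}(k-1) + \mu_2\, e^{(m)2}(k), \tag{4-9}$$

$$m_{4e}^{(m)}(k) = (1 - \mu_2)\,m_{4e}^{(m)}(k-1) + \mu_2 |e^{(m)}(k)|^4, \tag{4-10}$$

[Eq. (4-11) is illegible in the source.]

$$e^{(m)}(k) = C^{(m)T}(k)\,Y^{(m)}(k). \tag{4-12}$$

Set

$$\hat{y}^{(m)}(k) = \sum_{n=-L_1}^{L_2} \hat{F}_n^{(m)}(k)\,e^{(m)}(k-n) \tag{4-13}$$

where $\hat{y}^{(m)}(k)$ represents (cf. (3-6)) the contribution of the extracted source at the $m$-th stage to the measurements at time $k$, and where ($n = -L_1, -L_1+1, \ldots, L_2$)

$$\hat{F}_n^{(m)}(k) = R_n^{(m)}(k)/\bar{m}_{2e}^{(m)}(k), \tag{4-14}$$

$$\bar{m}_{2e}^{(m)}(k) = (1 - \mu_3)\,\bar{m}_{2e}^{(m)}(k-1) + \mu_3 |e^{(m)}(k)|^2, \tag{4-15}$$

$$R_n^{(m)}(k) = (1 - \mu_3)\,R_n^{(m)}(k-1) + \mu_3\, y^{(m)}(k)\,e^{(m)*}(k-n) \tag{4-16}$$

and

$$y^{(m+1)}(k) = y^{(m)}(k) - \hat{y}^{(m)}(k). \tag{4-17}$$

Define

$$\hat{Y}_i^{(m)}(k) = [\, \hat{y}_i^{(m)}(k) \;\; \cdots \;\; \hat{y}_i^{(m)}(k-L_e+1) \,]^T \tag{4-18}$$

where $\hat{y}_i^{(m)}(k)$ denotes the $i$-th component of $\hat{y}^{(m)}(k)$. Set

$$Y^{(m+1)}(k) = [\, Y_1^{(m+1)T}(k) \;\; \cdots \;\; Y_N^{(m+1)T}(k) \,]^T \tag{4-19}$$

where

$$Y_i^{(m+1)}(k) = Y_i^{(m)}(k) - \hat{Y}_i^{(m)}(k). \tag{4-20}$$

ENDDO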

The sequence $\{\hat{y}^{(m)}(k)\}$ in (4-13) represents the contribution of the extracted source at the m-th stage to the measurements at time k. The variable $e^{(m)}(k)$ in (4-12) corresponds to (3-2), the equalizer tap vector updated in (4-5) corresponds to (3-3), and the gradient term in (4-5) is the instantaneous gradient, all at time k and stage m. In (4-5) $\mu_1$ is the update step-size and in (4-8)-(4-10) and (4-15)-(4-16), $\mu_2$ and $\mu_3$, respectively, are the forgetting factors ($> 0$, $< 1$).
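The channel-estimate recursions (4-14)-(4-16) amount to exponentially weighted estimates of the cross-correlation between the stage measurements and the equalizer output, normalized by a running estimate of the output power. The following Python sketch illustrates only that idea, under stated assumptions: the function name `track_channel`, the initial power value of 1, the synthetic 4-QAM signal, the 3-tap channel and the step size in the demo are illustrative choices and are not taken from the paper.

```python
import numpy as np

def track_channel(y, e, L1, L2, mu3):
    """Exponentially weighted estimate of F_n for n = -L1..L2 (illustrative sketch).

    y : (T, N) array of stage measurements, e : (T,) equalizer output.
    Returns an (L1 + L2 + 1, N) array whose row n + L1 estimates F_n.
    """
    T, N = y.shape
    lags = np.arange(-L1, L2 + 1)
    R = np.zeros((lags.size, N), dtype=complex)   # running estimate of E{y(k) e*(k-n)}
    m = 1.0                                       # running estimate of E{|e(k)|^2} (assumed start)
    for k in range(L2, T - L1):
        m = (1.0 - mu3) * m + mu3 * abs(e[k]) ** 2
        R = (1.0 - mu3) * R + mu3 * np.outer(np.conj(e[k - lags]), y[k])
    return R / m

# Demo on synthetic data: one 4-QAM source through a hypothetical 3-tap, 2-sensor channel.
rng = np.random.default_rng(0)
T = 20000
e = (rng.choice([-1.0, 1.0], T) + 1j * rng.choice([-1.0, 1.0], T)) / np.sqrt(2)
f_true = np.array([[0.2, 0.0], [0.8, 0.3], [0.4, -0.6]], dtype=complex)  # taps at lags 0, 1, 2
y = np.stack([np.convolve(e, f_true[:, i])[:T] for i in range(2)], axis=1)
F_hat = track_channel(y, e, L1=1, L2=3, mu3=0.005)
print(np.round(F_hat, 2))
```

Rows 1 to 3 of `F_hat` should approach the three true taps, and the rows for lags -1 and 3 should stay near zero; a smaller step size trades slower tracking for lower estimation noise.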

Running Cost. To monitor the convergence of the equalizers in the various stages of the algorithm, it is useful to calculate a running cost with the sign. The running cost for the m-th stage at time k is given by


rim) 


-2 


(4-21) 


where the moment estimates are computed as in (4-8)-(4-10) but with a smaller $\mu_2$.

5.  FURTHER  MODIFICATIONS 

5.1.  MMSE  Signal  Separation 

5.1.1.  Non-recursive  Processing

A by-product of the solutions of Secs. 3 and 4 is the estimates of the system/channel impulse response. These estimates can be used to design MMSE estimators of $\mathcal{F}^{(j)}(z) w_j(k)$ with a controlled delay d to obtain an "optimum" performance (ignoring any effects of additive noise on the channel estimates). Let $F_l^{(j)}$ denote the j-th column of $F_l$. We wish to design a linear MMSE filter (equalizer) of length $L_e + 1$ to estimate $y^{(j)}(k-d)$ as $\hat{y}^{(j)}(k-d)$ given $y(l)$ for $l = k, k-1, \ldots, k-L_e+1$ where $d > 0$,


:=  J^^\z)wj{k)  =  ^Fp)m,(fc-/),  (5-1) 
1=0 

$\hat{y}^{(j)}(k-d) := \sum_{i=0}^{L_e-1} G_i\, y(k-i)$. (5-2)

Using  the  orthogonality  principle,  the  desired  solution  is 
given  by 

[  Go  •**  Gx,^— 1  1  [  Hd  Hd— ]'^yy 

(5  —  3) 

where $\mathcal{R}_{yy}$ denotes an $[NL_e] \times [NL_e]$ correlation matrix with $R_{yy}(j-i)$ as its $ij$-th block element,


B-yy{p)  :=  E{yit  +  p)y’'(f)}.  :=  Y, 

fc=0 

(5-4) 

In practice, we replace all the unknowns by their estimates. Also we design the equalizer only up to a scale factor by omitting $\sigma_{wj}^2$ from (5-3).

Remark 1. Selection of Delay d: In designing (5-2) the delay d was predetermined. One may choose to select d via exhaustive optimization as detailed below. The MMSE when (5-2) is used can be expressed as

J{d)  =  trE  {y^^\k  -  d)y^’^^(k  -  d)}  -  j’(d)  (5  -  5) 

where  ^  _ 

J'(d)  :=  o-tj  (5  -  6) 

;=  [  Hd  Hd_i  •  •  •  Hd-z,.  ] .  (5-7) 

Since the first term on the right side of (5-5) is independent of d, minimizing $J(d)$ w.r.t. d is equivalent to maximizing $J'(d)$.
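The exhaustive search over the delay in Remark 1 can be illustrated with a simplified Python sketch. It is an assumption-laden stand-in rather than the paper's estimator: it estimates a scalar, unit-variance white source $w(k-d)$ (instead of the vector contribution in (5-2)) from a stacked window of $L_e$ measurements, and the function name `mmse_per_delay`, the channel taps and the noise level are hypothetical.

```python
import numpy as np

def mmse_per_delay(f, Le, sigma2):
    """MMSE of a length-Le linear estimate of w(k-d), for every candidate delay d.

    f : (L+1, N) array, f[l] is the N-sensor channel tap at lag l.
    Assumes a white unit-variance source and white noise of variance sigma2.
    """
    L, N = f.shape[0] - 1, f.shape[1]
    # Y(k) = [y(k); ...; y(k-Le+1)] = T [w(k); ...; w(k-Le-L+1)] + noise
    T = np.zeros((N * Le, Le + L), dtype=complex)
    for r in range(Le):
        for l in range(L + 1):
            T[r * N:(r + 1) * N, r + l] = f[l]
    R = T @ T.conj().T + sigma2 * np.eye(N * Le)      # correlation matrix of Y(k)
    # gain[d] = t_d^H R^{-1} t_d, the quantity to be maximized over d
    gain = np.real(np.sum(T.conj() * np.linalg.solve(R, T), axis=0))
    return 1.0 - gain

f = np.array([[0.2, 0.0], [0.8, 0.3], [0.4, -0.6]], dtype=complex)  # hypothetical taps
mmse = mmse_per_delay(f, Le=4, sigma2=0.01)
best_d = int(np.argmin(mmse))
print(best_d, mmse[best_d])
```

As in (5-5), the MMSE splits into a delay-independent term (here the source variance, 1) minus a delay-dependent gain, so minimizing the MMSE and maximizing the gain select the same delay.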

5.1.2.  Adaptive  Implementation 

Note that $\mathcal{R}_{yy}$ does not depend upon the stage m of the algorithm of Sec. 4. Its computation can easily be recursified by using the matrix inversion lemma: see Table 13.1 on p. 569 in [13]. Denote the data-based adaptive estimate of

llyy  at  time  k  as  ^yy(A:).  Let  h|"”^(A;)  denote  the  estimate 
of  Hi  at  stage  m  and  time  k  of  the  multistage  algorithm  of 
Sec.  4.  Note  that  F[r\k)  in  (4-14)  (see  also  (3-5))  denotes 

an  estimate  of  fIi*^  for  some  t  G  {1,  2,  •  •  • ,  M}  (up  to  a  scale 
factor  and  time  shift).  Therefore,  from  (5-2)  and  (5-5)  we 
obtain  the  adaptive  implementation  at  stage  m;  details  are 
omitted. 
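The matrix inversion lemma step mentioned above can be sketched as follows. This is a generic rank-one update written under assumptions: the correlation estimate is taken to evolve as $R(k) = \lambda R(k-1) + y(k) y^H(k)$ with a hypothetical forgetting factor $\lambda$, which may differ in scaling from the recursion tabulated in [13], and the name `update_inverse` is illustrative.

```python
import numpy as np

def update_inverse(P, y, lam):
    """Return the inverse of lam * R + y y^H given P = inverse of R (P Hermitian)."""
    Py = P @ y
    denom = lam + np.vdot(y, Py).real          # lam + y^H P y (real for Hermitian P)
    return (P - np.outer(Py, Py.conj()) / denom) / lam

# Demo: track the inverse recursively and compare with the directly accumulated matrix.
rng = np.random.default_rng(0)
n, lam = 6, 0.99
R = np.eye(n, dtype=complex)
P = np.eye(n, dtype=complex)
for _ in range(200):
    y = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    R = lam * R + np.outer(y, y.conj())
    P = update_inverse(P, y, lam)
err = np.max(np.abs(P @ R - np.eye(n)))
print(err)
```

Each update costs on the order of $n^2$ operations instead of the $n^3$ needed for a fresh inversion, which is what makes the per-sample adaptive implementation practical.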

5.2.  Adaptive  Filter  Reinitialization 

In the source-iterative (multistage) approaches of Secs. 3 and 4, any errors in cancelling the extracted sources from the preceding stages $l = 1, 2, \ldots, m-1$ affect the performance at stage m. The only stage that is immune to this phenomenon is stage m = 1. A possible solution to alleviate this error propagation from stage to stage is to use parallel stages: we still have M stages for M sources, but they all operate directly on the given data record in parallel, with different initializations of the equalizers. The problem here is how to ensure that each stage converges to a distinct source. Here we propose to initialize the parallel stages using the results of the serial multistage implementation of Sec. 4 coupled with an MMSE solution similar to that of Sec. 5.1. For stage m = 1, there are no changes to the algorithm of Sec. 4. For stages $m \ge 2$, run the algorithm of Sec. 4 till the running cost (4-21) reaches a steady state. Given the estimates of the subchannel impulse response at stage m, we can design an MMSE filter (in a fashion similar to Sec. 5.1.2) to estimate $w_j(k-d)$ given $y(l)$ for $l = k, k-1, \ldots, k-L_e+1$. Let the extracted $w_j(k)$ at stage m be denoted by $\hat{w}^{(m)}(k)$. Mimicking Sec. 5.1.2, a recursive MMSE solution at stage m and time k is given by


L.-l 

ts<'")(]fe  -  d)  :=  Y  -  *) 

t=0 

where 

\^r\k)  G<rV)  ^^Uk)] 

=  [  F(/‘)”(jfe)  •  •  •  F^’"^^(jb)  0  ■  ■  ■  0  ]  Tyy{k).  ^ 

At  stage  m  and  time  k,  w^^\k  —  d)  is  an  MMSE  estimate 
(with  delay  d)  of  e^”^^(A:)  for  the  parallel  implementation. 
Note  that  C(z)  =  is  the  desired  MMSE 

initializer. 


6.  SIMULATION  EXAMPLE 

Take N = 3 and M = 2 in (2-2) with


0.2-f-0,8z^i  -(-0.4Z-2 
0.3z“^  -  0.6z-2 
0. 


^2)(z) 


0.5  -  O.Sz-i 

-0.21z“^  -  0.5z“2  ^  0.72Z--®  +  0.36z-'‘  +  0.21z~® 
0. 


Fig.1. 


The input $\{w_1(k)\}$ is an i.i.d. complex Gaussian-mixture with 4th normalized cumulant as 0.7433. The input $\{w_2(k)\}$ is an i.i.d. 4-QAM sequence with 4th normalized cumulant as -1. The additive noise is white, complex Gaussian. The powers of $\{w_j(k)\}$ were scaled so as to have $E\{\|\mathcal{F}^{(1)}(z) w_1(k)\|^2\} = E\{\|\mathcal{F}^{(2)}(z) w_2(k)\|^2\}$. The performance measure was taken to be the signal-to-interference-and-noise ratio (SINR) per source signal, defined as


$\mathrm{SINR}_j = \dfrac{E\{\|y^{(j)}(k)\|^2\}}{E\{\|y^{(j)}(k) - \hat{\alpha}\, \hat{y}^{(j)}(k)\|^2\}}$ (6-1)


where $\hat{\alpha}$ is that value of the scalar $\alpha$ which minimizes $E\{\|y^{(j)}(k) - \alpha\, \hat{y}^{(j)}(k)\|^2\}$. The length of the inverse filters was 11 samples per sensor (output) for the approach of Sec. 4. The initial guess for the tap gains was: set $c_i(5) = 1$ for $i = m$ for the m-th stage equalizer (m = 1, 2) with the remaining tap gains set to zero. The algorithm step sizes and forgetting factors for each stage m were chosen as: $\mu_1 = 0.0005$, $\mu_2 = 0.015$ and $\mu_3 = 0.0005$ when $\gamma^{(m)} < 0$ (see (4-11)), and $\mu_1 = 0.00025$, $\mu_2 = 0.0075$ and $\mu_3 = 0.0005$ when $\gamma^{(m)} > 0$. For the running cost (4-21) computation we selected "$\mu_2$" = 0.002 in (4-8)-(4-10). The parameters $L_1$ and $L_2$ in (4-13) were selected as $L_1 = 15$ and $L_2 = 6$. To design the MMSE equalizers/filters we took $L_e = 11$ and d was optimized following Remark 1 of Sec. 5.1.1 over the range [-15, 6].
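The scalar fit inside the SINR measure has a closed-form least-squares solution, which the following Python sketch illustrates. It assumes that the numerator of (6-1) is the power of the true source contribution and that expectations are replaced by sample averages; the function name `sinr`, the decibel conversion and the synthetic data are illustrative and not part of the paper.

```python
import numpy as np

def sinr(y_true, y_est):
    """Sample SINR after the best complex scalar fit of y_est to y_true.

    y_true, y_est : (T, N) arrays. Returns (SINR as a ratio, SINR in dB, alpha_hat).
    """
    # alpha_hat minimizes sum ||y_true - alpha * y_est||^2 over complex alpha
    alpha_hat = np.vdot(y_est, y_true) / np.vdot(y_est, y_est)
    err = y_true - alpha_hat * y_est
    ratio = np.sum(np.abs(y_true) ** 2) / np.sum(np.abs(err) ** 2)
    return ratio, 10.0 * np.log10(ratio), alpha_hat

# Demo: the estimate is a scaled copy of the truth plus a small perturbation.
rng = np.random.default_rng(0)
y_true = rng.standard_normal((1000, 3)) + 1j * rng.standard_normal((1000, 3))
pert = 0.01 * (rng.standard_normal((1000, 3)) + 1j * rng.standard_normal((1000, 3)))
y_est = 0.5 * y_true + pert
ratio, ratio_db, alpha_hat = sinr(y_true, y_est)
print(ratio_db, alpha_hat)
```

The scalar fit removes the scale ambiguity that is inherent in blind separation, so the measure penalizes only residual interference and noise.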

Fig. 1 shows the evolution of the average running cost (see (4-21)), averaged over 100 Monte Carlo runs (after 'assigning' each equalizer cost to its corresponding extracted source) without using any filter reinitialization. Fig. 2 shows the average running cost when the reinitialization of Sec. 5.2 is used (after 12000 samples). It turns out that source 1 ($w_1(k)$) is extracted first, so that reinitialization only affects source 2 (4-QAM). Table I shows the average SINR (based on 100 runs) for the two sources at the end of the run (i.e., at k = 18000) without and with filter reinitialization, for various SNRs. The SINRs were computed using the solution (4-13) as well as the MMSE solution of Sec. 5.1.2. It is seen that blind signal separation benefits from both MMSE signal separation and filter reinitialization.


Fig.  2. 
ABSTRACT

In several communications (and related) applications the underlying equivalent discrete-time mathematical model is that of a multiple-input multiple-output (MIMO) linear system where the number of inputs equals the number of users (sources) and the number of outputs is related to the number of sensors and the sampling rate. The vector input sequence represents the information sequences of the various users. Existence of finite-length multi-step (including one-step) linear predictors plays a key role in blind identification and equalization of multiple-input multiple-output (MIMO) systems. In this paper we first derive an upper bound on the length of a linear predictor for MIMO systems with irreducible transfer functions. Then multi-step linear predictors for IIR/FIR MIMO channels are considered. An upper bound on the length of the one-step predictor is known for the case when the underlying MIMO transfer function is irreducible and column-reduced. When the MIMO transfer function is irreducible but not necessarily column-reduced, it is known that a finite-length linear predictor exists; however, its length has not been previously specified in the literature. In the past, multi-step linear predictors have been considered in the literature only for single-input multiple-output models.


1.  INTRODUCTION 

Consider a discrete-time IIR MIMO system with N outputs and M inputs:

$y(k) = \mathcal{F}(z) w(k) + n(k) = s(k) + n(k)$ (1)

where $y(k) = [\, y_1(k) \,\vdots\, y_2(k) \,\vdots\, \cdots \,\vdots\, y_N(k) \,]^T$, similarly for $w(k)$, $s(k)$ and $n(k)$, $z$ is the Z-transform variable as well as the backward-shift operator (i.e., $z^{-1} w(k) = w(k-1)$, etc.), $s(k)$ is the noise-free output, $n(k)$ is the additive measurement noise and the $N \times M$ matrix $\mathcal{F}(z)$ is given by

$\mathcal{F}(z) = A^{-1}(z) B(z)$

where

$A(z) = I + \sum_{i=1}^{n_a} A_i z^{-i}$ and $B(z) = \sum_{i=0}^{n_b} B_i z^{-i}$. (2)

We allow all of the above variables to be complex-valued. The following assumptions are made regarding (1)-(2):

(HI)  N>  M, 

This  work  was  supported  by  the  Office  of  Naval  Research 
under  Grant  N00014-97-1-0822. 


(H2) Rank$\{B(z)\} = M$ $\forall z$ including $z = \infty$ but excluding $z = 0$, i.e., $B(z)$ is irreducible [5, Sec. 6.3].

(H3) Unobserved input sequence $\{w(k)\}$ is zero-mean, white. Take $E\{w(k) w^H(k)\} = I_M$ by absorbing any non-identity correlation of $w(k)$ into $B(z)$ where $I_M$ is the $M \times M$ identity matrix and the superscript H is the Hermitian operator (complex conjugate transpose).

(H4) $\{n(k)\}$ is zero-mean with $E\{n(k+\tau) n^H(k)\} = \sigma_n^2 I_N \delta(\tau)$.

(H5) $\det(A(z)) \ne 0$ for $|z| \ge 1$.

Notations and Definitions: Let $B^{(l)}(z)$ denote the l-th column of $B(z)$ such that $B^{(l)}(z) = \sum_{i=0}^{L_l} B_i^{(l)} z^{-i}$ where $L_l = \deg(B^{(l)}(z))$ = lowest degree of the polynomial $B^{(l)}(z)$. By (2), $L_l \le n_b$ $\forall l$. The polynomial matrix $B(z)$ is said to be column-reduced if rank$\{[\, B_{L_1}^{(1)} \,\vdots\, \cdots \,\vdots\, B_{L_M}^{(M)} \,]\} = M$ [5]. Consider the Hilbert space $\mathcal{H}$ of square integrable complex random variables on a common probability space endowed with the inner product (for scalar complex random variables $x_1$ and $x_2$) $\langle x_1, x_2 \rangle = E\{x_1 x_2^*\}$ where the superscript $*$ denotes complex conjugation (see [4]). Let $Sp\{x_i,\, i \in I\}$ denote the subspace of $\mathcal{H}$ generated by the random variables/vectors in the set $\{x_i,\, i \in I\}$. Given an N-variate $s(k)$ with i-th component $s_i(k)$, define the subspace

$H_{k-m;\, L_1, L_2, \ldots, L_N}(s) := Sp\{ s_i(k - l_i),\; m \le l_i \le L_i;\; i = 1, 2, \ldots, N \}$.

We will use $H_{k-m}(s)$ to denote $H_{k-m;\, \infty, \ldots, \infty}(s)$. Let $(s(k)\,|\,H_{k-1}(s))$ denote the orthogonal projection of $s(k)$ onto the subspace $H_{k-1}(s)$ [4].

Models such as (1)-(2) with $\mathcal{F}(z) = B(z)$ arise in several useful digital communications and other applications [1]-[3], [6]-[8] where one of the objectives is to estimate the multichannel impulse response $\{B_i\}$ and/or to recover the inputs $w(k)$ given the noisy measurements but not given the knowledge of the system transfer function. One of the popular approaches is that using linear prediction [1]-[3] where existence of finite-length one-step linear predictors plays a key role. In the MIMO case it is known that under (H1)-(H3), finite-length one-step linear predictors exist for the process $s(k)$ [6], [7]. The length (or an upper bound on it) has not been specified in [6], [7]. Under an additional condition that $B(z)$ is column-reduced, it is stated in [1] that there exists a linear predictor (for $s(k)$) of length no

longer  than 

A  one-step  linear  prediction-based  approach  was  first  pro¬ 
posed  in  [12]  and  later  expanded  upon  in  [2].  Unlike  the 
subspace-based  methods  of  [13],  [14]  and  others  (see  also 


N 


[3] and references therein), the linear prediction (LP) based approach of [12] and [2] turns out to be rather insensitive to the order of the underlying FIR channel (so long as one overfits). More recently, it has been pointed out in [15] and [16] that the LP-based approach can be further significantly improved by utilizing some additional information not exploited by LP. Although [15] and [16] derive their algorithms in quite a different manner, their final algorithms are essentially the same. In this paper we will follow the approach of [16] which is based upon multi-step linear prediction. Unlike [16] we allow multiple inputs and IIR channels. Unlike [15] we allow MIMO transfer functions that are not column-reduced and we also allow IIR channels.

2.  FINITE-LENGTH  ONE-STEP  LINEAR
PREDICTORS  FOR  $\mathcal{F}(z) = B(z)$

By [5, Sec. 6.3] there exists an $M \times M$ unimodular matrix $W(z)$ such that

$B(z) = \bar{B}(z) W(z)$ (3)

where $\bar{B}(z)$ is column-reduced and $W(z)$ is unimodular (i.e.

'  '  _ /n 

det(>V(z))  =  constant).  Let  W  '{z)  =  i-th  column  of 
B{z),  Li  =  deg  and  B<‘^(z)  = 

=  deg  {}V{z)y  Then 

p 

W~^{z)  =  where  —  (4) 

ij:;0 


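Define the $[NK] \times [K + \bar{L}_i]$ generalized Sylvester matrix (a Toeplitz matrix)

$\mathcal{T}_K(\bar{B}^{(i)}) := \begin{bmatrix} \bar{B}_0^{(i)} & \cdots & \bar{B}_{\bar{L}_i}^{(i)} & 0 & \cdots & 0 \\ 0 & \bar{B}_0^{(i)} & \cdots & \bar{B}_{\bar{L}_i}^{(i)} & \cdots & 0 \\ \vdots & & \ddots & & \ddots & \vdots \\ 0 & \cdots & 0 & \bar{B}_0^{(i)} & \cdots & \bar{B}_{\bar{L}_i}^{(i)} \end{bmatrix}$. (5)

Further define the $[NK] \times [MK + \sum_{i=1}^{M} \bar{L}_i]$ matrix

$\mathcal{T}_K(\bar{B}) := [\, \mathcal{T}_K(\bar{B}^{(1)}) \;\cdots\; \mathcal{T}_K(\bar{B}^{(M)}) \,]$. (6)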

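Set $x(k) := W(z) w(k)$ so that $s(k) = \bar{B}(z) x(k)$. Then we have

$S_K(k) = \mathcal{T}_K(\bar{B})\, X_K(k)$ (7)

where $S_K(k) := [\, s^T(k) \,\vdots\, \cdots \,\vdots\, s^T(k-K+1) \,]^T$ and $X_K(k) := [\, x_1(k) \,\vdots\, \cdots \,\vdots\, x_1(k-\bar{L}_1-K+1) \,\vdots\, x_2(k) \,\vdots\, \cdots \,\vdots\, x_2(k-\bar{L}_2-K+1) \,\vdots\, \cdots \,\vdots\, x_M(k-\bar{L}_M-K+1) \,]^T$ ($x_i(k)$ is the i-th component of $x(k)$). It is known [9] (see also [1], [8]) that rank$\{\mathcal{T}_K(\bar{B})\} = MK + \sum_{i=1}^{M} \bar{L}_i$ if $K \ge \sum_{i=1}^{M} \bar{L}_i$. Therefore, a left inverse to $\mathcal{T}_K(\bar{B})$ exists. Hence, if $K \ge \sum_{i=1}^{M} \bar{L}_i$ we have

$H_{k;\, K-1, K-1, \ldots, K-1}(s) = H_{k;\, K+\bar{L}_1-1, \ldots, K+\bar{L}_M-1}(x)$. (8)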
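The stacked relation (7) can be checked numerically. The Python sketch below builds the generalized Sylvester matrix column block by column block and verifies that the stacked outputs equal the matrix times the stacked inputs. The function names, the column degrees and the random coefficients are hypothetical; the sketch works directly with a randomly drawn polynomial matrix that plays the role of the column-reduced factor.

```python
import numpy as np

def sylvester_block(Bcol, K):
    """[N K] x [K + L] Toeplitz block for one column; Bcol[l] is the z^{-l} coefficient."""
    L, N = Bcol.shape[0] - 1, Bcol.shape[1]
    T = np.zeros((N * K, K + L), dtype=complex)
    for r in range(K):                      # block row r corresponds to s(k - r)
        for l in range(L + 1):
            T[r * N:(r + 1) * N, r + l] = Bcol[l]
    return T

def generalized_sylvester(cols, K):
    """Concatenate the per-column blocks as in (6)."""
    return np.hstack([sylvester_block(c, K) for c in cols])

rng = np.random.default_rng(0)
N, K = 3, 4
degs = [1, 2]                               # hypothetical column degrees
cols = [rng.standard_normal((L + 1, N)) for L in degs]
T_K = generalized_sylvester(cols, K)

# Simulate s(t) = sum_i sum_l cols[i][l] x_i(t - l) and stack as in (7).
Tlen, k = 40, 30
x = rng.standard_normal((len(degs), Tlen))
s = np.zeros((Tlen, N))
for i, c in enumerate(cols):
    for l in range(c.shape[0]):
        s[l:] += np.outer(x[i, :Tlen - l], c[l])
S_K = np.concatenate([s[k - r] for r in range(K)])
X_K = np.concatenate([x[i, k - np.arange(K + L)] for i, L in enumerate(degs)])
print(np.linalg.matrix_rank(T_K), T_K.shape)
```

For generic coefficients the printed rank should equal the column count, in line with the rank result cited from [9]; the sketch asserts only the algebraic identity (7), which holds for any coefficients.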

Since $x(k) = W(z) w(k)$, using (4) it follows that $w(k) = \sum_{i=0}^{p} \bar{W}_i\, x(k-i)$. Therefore, for some $C_m$'s, we have

$\hat{s}(k|k-1) := \sum_{i=1}^{n_b} B_i\, w(k-i) = \sum_{i=1}^{n_b} B_i \Big[ \sum_{l=0}^{p} \bar{W}_l\, x(k-i-l) \Big] = \sum_{m=1}^{p+n_b} C_m\, x(k-m)$. (9)

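Define $K_0 := \max\{ \sum_{i=1}^{M} \bar{L}_i,\; n_b + p \}$. If $K \ge K_0$, then $\hat{s}(k|k-1) \in H_{k-1;\, p+n_b, \ldots, p+n_b}(x) \subset H_{k-1;\, K+\bar{L}_1, \ldots, K+\bar{L}_M}(x) = H_{k-1;\, K, \ldots, K}(s)$. Therefore, there exist $A_i$'s such that

$\hat{s}(k|k-1) = -\sum_{i=1}^{K} A_i\, s(k-i)$. (10)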

Using (1), (2) and (9), we have

$s(k) = \hat{s}(k|k-1) + e(k|k-1)$ where $e(k|k-1) := B_0 w(k)$. (11)

By (H3) it follows that $E\{e(k|k-1)\, s^H(k-m)\} = 0$ $\forall m \ge 1$. Hence, by the orthogonal projection theorem (OPT) [4], we have $\hat{s}(k|k-1) = (s(k)\,|\,H_{k-1}(s))$. But by (10) and OPT, we also have $\hat{s}(k|k-1) = (s(k)\,|\,H_{k-1;\, K_0, \ldots, K_0}(s))$. Thus, $e(k|k-1)$ is the linear innovations process of $\{s(k)\}$ [4]. It remains to 'simplify' $K_0$. By (3) and [5, Thm. 6.3-13] (the predictable-degree property of column-reduced matrices),
it  follows  that  L^^^  <  rib  and  Li  <  nt  (1  <  /  <  Jlf). 
Therefore,  p  <  (M  —  l)7ib  and  —  Lfnt.  Hence, 

$K_0 \le M n_b$. The above discussion is summarized below.


Theorem 1. Under (H1)-(H3), there exists an integer $K \le M n_b$ and a polynomial matrix $\mathcal{A}(z) = I_N + \sum_{i=1}^{K} A_i z^{-i}$ of degree K such that $\mathcal{A}(z) s(k) = e(k|k-1) = B_0 w(k)$. The linear innovations process of $\{s(k)\}$ is $\{e(k|k-1)\}$. •

It follows from (1) and Theorem 1 that

$\mathcal{A}(z) s(k) = \mathcal{A}(z) B(z) w(k) = B_0 w(k)$. (12)

Since $w(k)$ is full-rank and white, it follows that

$\mathcal{A}(z) B(z) = B_0 \;\Rightarrow\; (B_0^H B_0)^{-1} B_0^H \mathcal{A}(z) B(z) = I_M$. (13)

Clearly the $M \times N$ polynomial matrix $\mathcal{G}(z) := (B_0^H B_0)^{-1} B_0^H \mathcal{A}(z)$ is of degree $K \le M n_b$ and it is a left inverse to $B(z)$.

Theorem 2. Under (H1)-(H3), there exists an integer $K \le M n_b$ and a polynomial matrix $\mathcal{G}(z) = \sum_{i=0}^{K} G_i z^{-i}$ of degree K such that $\mathcal{G}(z) B(z) = I_M$. •

In [8] we derived the upper bound on $\deg(\mathcal{G}(z))$ as $(2M-1) n_b - 1$. Clearly Theorem 2 offers a better bound for $M > 2$.
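Theorems 1 and 2 can be illustrated numerically for an FIR channel. The Python sketch below draws a random $3 \times 2$ channel of degree 2 (hypothetical values, assumed to satisfy (H1)-(H2) since randomly drawn coefficients do so generically), computes a one-step predictor of length $K = M n_b$ from a left inverse of the stacked channel matrix, and checks that the prediction-error filter times the channel collapses to the leading coefficient. This is a sketch of the idea using the true channel, not a data-based estimator.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, nb = 3, 2, 2
K = M * nb                                   # predictor length allowed by Theorem 1
B = rng.standard_normal((nb + 1, N, M))      # B[l] multiplies z^{-l} (hypothetical channel)

# Z(k) = [s(k-1); ...; s(k-K)] = T W(k), with W(k) = [w(k-1); ...; w(k-K-nb)].
T = np.zeros((N * K, M * (K + nb)))
for r in range(K):                           # block row r corresponds to s(k-1-r)
    for l in range(nb + 1):
        T[r * N:(r + 1) * N, (r + l) * M:(r + l + 1) * M] = B[l]

# Predictable part of s(k): sum_{i>=1} B_i w(k-i) = D W(k).
D = np.zeros((N, M * (K + nb)))
for l in range(1, nb + 1):
    D[:, (l - 1) * M:l * M] = B[l]

P = D @ np.linalg.pinv(T)                    # s_hat(k|k-1) = P Z(k), so P = [-A_1 ... -A_K]
A = [np.eye(N)] + [-P[:, i * N:(i + 1) * N] for i in range(K)]

# Coefficients of the product A(z) B(z): C_n = sum_i A_i B_{n-i}.
C = np.zeros((K + nb + 1, N, M))
for i in range(K + 1):
    for l in range(nb + 1):
        C[i + l] += A[i] @ B[l]

G0_check = np.linalg.pinv(B[0]) @ C[0]       # leading coefficient of G(z) B(z)
print(np.max(np.abs(C[1:])))
```

If the stacked matrix has full column rank, all coefficients of the product beyond the zeroth vanish up to rounding, so pre-multiplying the prediction-error filter by the pseudo-inverse of $B_0$ gives a polynomial left inverse as in (13).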

3.  FINITE-LENGTH  MULTI-STEP  LINEAR 
PREDICTORS 

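We now treat the general case $\mathcal{F}(z) = A^{-1}(z) B(z)$. We have

$\mathcal{F}(z) = \sum_{i=0}^{\infty} F_i z^{-i}$ (14)

and

$s(k) = -\sum_{i=1}^{n_a} A_i\, s(k-i) + \sum_{i=0}^{n_b} B_i\, w(k-i)$. (15)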


It then follows from (15) that

$s(k) = -\sum_{i=2}^{n_a} A_i\, s(k-i) - A_1 \Big[ -\sum_{i=1}^{n_a} A_i\, s(k-1-i) + \sum_{i=0}^{n_b} B_i\, w(k-1-i) \Big] + \sum_{i=0}^{n_b} B_i\, w(k-i)$

$\quad = -\sum_{i=2}^{n_a+1} A_{i2}\, s(k-i) + \sum_{i=0}^{n_b+1} B_{i2}\, w(k-i)$ (16)

for some appropriate choices of the parameters (matrices) $A_{i2}$'s and $B_{i2}$'s. Now substitute for $s(k-2)$ using (15) in (16), and continuing this way, we have, in general, for appropriate choices of $A_{il}$'s and $B_{il}$'s ($l \ge 1$)

$s(k) = -\sum_{i=l}^{n_a+l-1} A_{il}\, s(k-i) + \sum_{i=0}^{n_b+l-1} B_{il}\, w(k-i)$. (17)

Both (15) and (17) represent the same signal/system and therefore, they must have the same impulse response. By (14), (15) and (17), it then follows that

$B_{il} = F_i$ for $0 \le i \le l-1$. (18)

Let us rewrite (17) as

$s(k) = e(k|k-l) + \hat{s}(k|k-l)$ (19)

where

$e(k|k-l) := \sum_{i=0}^{l-1} B_{il}\, w(k-i) = \sum_{i=0}^{l-1} F_i\, w(k-i)$ (20)

and

$\hat{s}(k|k-l) := -\sum_{i=l}^{n_a+l-1} A_{il}\, s(k-i) + \sum_{i=l}^{n_b+l-1} B_{il}\, w(k-i)$. (21)
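The repeated substitution that leads to (17), and the identity (18), can be reproduced numerically. The Python sketch below is illustrative only: the function names and the randomly drawn ARMA coefficients are assumptions. It eliminates the most recent output term one step at a time and compares the first $l$ input coefficients with the impulse response.

```python
import numpy as np

def impulse_response(A, B, n):
    """First n coefficients F_i of A(z)^{-1} B(z); A = [A_1..A_na], B = [B_0..B_nb]."""
    N, M = B[0].shape
    F = []
    for i in range(n):
        Fi = B[i].copy() if i < len(B) else np.zeros((N, M))
        for j in range(1, min(i, len(A)) + 1):
            Fi -= A[j - 1] @ F[i - j]
        F.append(Fi)
    return F

def multistep_coeffs(A, B, l):
    """Coefficients of (17) by repeated substitution of (15): dicts {i: A_il}, {i: B_il}."""
    N, M = B[0].shape
    Al = {i + 1: A[i].copy() for i in range(len(A))}
    Bl = {i: B[i].copy() for i in range(len(B))}
    for step in range(1, l):
        lead = Al.pop(step, np.zeros((N, N)))        # coefficient of s(k - step)
        for j, Aj in enumerate(A, start=1):          # replace s(k - step) using (15)
            Al[step + j] = Al.get(step + j, np.zeros((N, N))) - lead @ Aj
        for j, Bj in enumerate(B):
            Bl[step + j] = Bl.get(step + j, np.zeros((N, M))) - lead @ Bj
    return Al, Bl

rng = np.random.default_rng(0)
N, M, na, nb, l = 2, 2, 2, 1, 4
A = [0.5 * rng.standard_normal((N, N)) for _ in range(na)]   # hypothetical AR part
B = [rng.standard_normal((N, M)) for _ in range(nb + 1)]     # hypothetical MA part
F = impulse_response(A, B, l)
Al, Bl = multistep_coeffs(A, B, l)
print(sorted(Al), sorted(Bl))
```

After $l-1$ substitutions the output terms start at lag $l$, and the input coefficients at lags $0$ to $l-1$ coincide with $F_0, \ldots, F_{l-1}$, which is exactly what makes (20) an expression in the impulse response alone.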


Theorem 3. Under (H1)-(H3), (H5), and for $l = 1, 2, \ldots$, $\{s(k)\}$ can be decomposed as in (19) such that

$E\{e(k|k-l)\, s^H(k-m)\} = 0 \;\; \forall m \ge l$, (22)

$\hat{s}(k|k-l) = (s(k)\,|\,H_{k-l}(s))$, (23)

$\hat{s}(k|k-l) \in H_{k-l;\, n_a+Mn_b+l-1, \ldots, n_a+Mn_b+l-1}(s)$ (24)

and

$\hat{s}(k|k-l) = (s(k)\,|\,H_{k-l;\, n_a+Mn_b+l-1, \ldots, n_a+Mn_b+l-1}(s))$. (25)

Proof: By (1), (2), (14) and (H5), we have

$s(k) = \sum_{i=0}^{\infty} F_i\, w(k-i)$. (26)

By Theorem 2 (applied to $B(z)$, using $A(z) s(k) = B(z) w(k)$), it follows that for a finite set of matrices $G_i$,

$\sum_{i \ge 0} G_i\, s(k-i) = w(k)$. (27)

Substituting for $w(k)$ from (27) in (21), it follows that

$\hat{s}(k|k-l) \in H_{k-l}(s)$. (28)

By (26) and (H3), we have

$E\{w(k)\, s^H(k-m)\} = 0 \;\; \forall m > 0$. (29)

Therefore, using (20) and (29), it follows that (22) is true. By (19), (22), (28) and the orthogonal projection theorem [4], it follows that (23) is true (as the "error" $e(k|k-l)$ is orthogonal to the data $s(k-m)$ ($m \ge l$), hence to the subspace $H_{k-l}(s)$).

It remains to establish (24) and (25). Define

$\tilde{s}(k) := A(z) s(k) = B(z) w(k) = \bar{B}(z) [W(z) w(k)] = \bar{B}(z) x(k)$ (30)

where  we  have  used  (3).  Using  (30)  and  rewriting  (8)  in  the 
notation  of  Sec.  3,  if  iiT  >  Tit,  we  have 

iffciK-i,  -.Kr-iCs)  =  W* 

It  also  follows  from  (30  )  that 

C  ^ffc;n.+L,-.n.  +  J.(s)  VL  >  0.  (32) 


(18) 

Therefore,  for  K  > 

(19) 

^  ■S^fc;Tia,  +  K‘~l,--,na+-Kr- 

and  in  general,  for  any  i  >  0, 

(33) 

■^/e -I ;  JC+Ti  +  i-i  ,•  •  • ,  JC+r^f  +  i  - 1 

(20) 

(34) 

As  in  (9)  we  have 

Tlfc  +  I— 1 


p+Tlb  +  t-l 


y^  Biiw(A;  —  t)  =  Cix(fc  -  i)  (35) 


for  some  Cm*s.  Therefore,  it  follows  that 


nt  +  I-l 


y^  Bitw(A;  -  i)  E  Hk  — I;p+TH,  +  l— +  (x)  (36) 


i=o 


i=l 

^  +  (37) 

C  iffc-I;n^+JC+l-l,  -.n»+iC+i-l(s)  (38) 

where, as in Sec. 2, $K_0 := \max\{ \sum_{i=1}^{M} \bar{L}_i,\; n_b + p \}$. It therefore follows from (21) and (38) that

$\hat{s}(k|k-l) \in H_{k-l;\, n_a+K+l-1, \ldots, n_a+K+l-1}(s) \;\; \forall K \ge K_0$. (39)

As in Sec. 2, $K_0 \le M n_b$. If we pick $K = M n_b$ in (39), we obtain (24). Finally, (25) follows from (19), (22), (24) and the orthogonal projection theorem [4]. This completes the proof of Theorem 3. □

It follows from Theorem 3 that

$\hat{s}(k|k-l) = -\sum_{i=l}^{L_l} A_{il}\, s(k-i)$ where $L_l \ge n_a + M n_b + l - 1$, (40)

for some $N \times N$ matrices $A_{il}$'s. By (19) and (22) (recall also the orthogonal projection theorem), we have

$\hat{s}(k|k-l) = \arg\{ \min_{x(k) \in H_{k-l}(s)} E\{\|s(k) - x(k)\|^2\} \}$. (41)

Therefore, $\hat{s}(k|k-l)$ is the l-step (ahead) linear predictor of $s(k)$ given $\{s(m),\; m \le k-l\}$. By (25) it is also the l-step (ahead) linear predictor of $s(k)$ given $\{s(m),\; k-L_l \le m \le k-l\}$. It follows from (19) and (40) that

similarly $\mathcal{R}_{nnL_1}$ pertaining to the additive noise. Carry out an eigendecomposition of $\mathcal{R}_{yyL_1}$. Then the smallest $N - M$ eigenvalues of $\mathcal{R}_{yyL_1}$ equal $\sigma_n^2$ because under (H1)-(H3) and (H5), rank$\{\mathcal{R}_{ssL_1}\} \le NL_1 + M$ whereas under (H4), rank$\{\mathcal{R}_{nnL_1}\} = NL_1 + N$ = rank$\{\mathcal{R}_{yyL_1}\}$. Thus a consistent estimate of $\sigma_n^2$ is obtained by taking it as the average of the smallest $N - M$ eigenvalues of $\hat{\mathcal{R}}_{yyL_1}$, the data-based consistent estimate of $\mathcal{R}_{yyL_1}$. The noise-free signal correlation function can then be estimated from the noisy-data correlations.
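The eigenvalue-averaging estimate of the noise variance can be sketched in Python as follows. The example uses exact (not data-based) correlations for a hypothetical FIR channel, so that $n_a = 0$ and $L_1 = M n_b$ suffices; the channel coefficients, the noise variance and all variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
N, M, nb = 3, 2, 2
L1 = M * nb                                   # FIR case (n_a = 0): L1 >= n_a + M * nb
sigma2 = 0.1                                  # true noise variance (hypothetical)
B = rng.standard_normal((nb + 1, N, M)) + 1j * rng.standard_normal((nb + 1, N, M))

# [s(k); ...; s(k-L1)] = T [w(k); ...; w(k-L1-nb)], so R_ss = T T^H for unit-variance white w.
T = np.zeros((N * (L1 + 1), M * (L1 + 1 + nb)), dtype=complex)
for r in range(L1 + 1):
    for l in range(nb + 1):
        T[r * N:(r + 1) * N, (r + l) * M:(r + l + 1) * M] = B[l]
R_ss = T @ T.conj().T
R_yy = R_ss + sigma2 * np.eye(N * (L1 + 1))   # white noise adds sigma2 on the diagonal

eigvals = np.linalg.eigvalsh(R_yy)            # real eigenvalues in ascending order
sigma2_hat = eigvals[:N - M].mean()           # average of the N - M smallest eigenvalues
R_ss_hat = R_yy - sigma2_hat * np.eye(N * (L1 + 1))
print(sigma2_hat)
```

With data-based correlation estimates the smallest eigenvalues scatter around the noise variance rather than equalling it, which is why the average over the $N - M$ smallest is used.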


$e(k|k-l) := s(k) + \sum_{i=l}^{L_l} A_{il}\, s(k-i) = \sum_{i=0}^{l-1} F_i\, w(k-i)$. (42)

It follows from (42) that for $l \ge 2$,

$e_{d,l}(k) := e(k|k-l) - e(k|k-l+1) = F_{l-1}\, w(k-l+1)$. (43)

Define an $[ND]$-vector ($D > 1$)

$\mathcal{E}(k) := \begin{bmatrix} e(k|k-1) \\ e_{d,2}(k+1) \\ \vdots \\ e_{d,D}(k+D-1) \end{bmatrix} = \begin{bmatrix} F_0 \\ F_1 \\ \vdots \\ F_{D-1} \end{bmatrix} w(k)$. (44)

Following the SIMO FIR channel results of [16], (44) can be used to estimate the MIMO channel impulse response $F_i$ for $i = 0, 1, \ldots, D-1$ (for arbitrary D) up to a unitary matrix ([1]). [This unitary matrix requires higher-order statistics for its estimation [1].] Knowledge of $F_i$ for $i = 0, 1, \ldots, D-1$ can be used to design an MMSE equalizer with lag $D - 1$ [10]. All of this relies on (H4) which allows determination of the noise variance from data by eigendecomposition of the data correlation matrix (which is discussed next).
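The "up to a unitary matrix" ambiguity can be made concrete with a short Python sketch. Under (H3) the correlation matrix of the stacked vector in (44) is the outer product of the stacked impulse response with itself, so a rank-M factorization recovers the stack only up to an $M \times M$ unitary factor. The random stacked response and the use of an eigendecomposition as the factorization are illustrative assumptions, not the estimator of [16].

```python
import numpy as np

rng = np.random.default_rng(1)
N, M, D = 3, 2, 4
# Hypothetical stacked impulse response [F_0; F_1; ...; F_{D-1}], size (N D) x M.
F = rng.standard_normal((N * D, M)) + 1j * rng.standard_normal((N * D, M))

R = F @ F.conj().T                            # correlation of the stacked vector, white unit-variance w
vals, vecs = np.linalg.eigh(R)
idx = np.argsort(vals)[::-1][:M]              # indices of the M dominant eigenvalues
F_hat = vecs[:, idx] * np.sqrt(vals[idx])     # rank-M factor with F_hat F_hat^H = R

Q = np.linalg.pinv(F) @ F_hat                 # the M x M matrix relating F_hat to F
print(np.round(np.abs(Q @ Q.conj().T), 6))
```

Second-order statistics cannot distinguish `F` from `F @ Q` for any unitary `Q`, which is why the text resorts to higher-order statistics to resolve the remaining factor.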

4.  ESTIMATION  OF  NOISE  VARIANCE 

In practice, we have noisy measurements $y(k)$ of $s(k)$ whereas the preceding discussion and results are based upon availability of the correlation function of $s(k)$. Lemma 1 below is useful in this regard. Consider the case of $l = 1$ (one-step prediction). By (42) we have

$s(k) = -\sum_{i=1}^{L_1} A_{i1}\, s(k-i) + F_0\, w(k)$. (45)

If $L_1 \ge n_a + M n_b$, then $A_{i1} = 0$ for $i > n_a + M n_b$ by virtue of (25).


Lemma 1. Under (H1)-(H3) and (H5), rank$\{\mathcal{R}_{ssL_1}\} \le NL_1 + M$ for $L_1 \ge n_a + M n_b$ where $\mathcal{R}_{ssL_1}$ is $[N(L_1+1)] \times [N(L_1+1)]$ with its $ij$-th block element given by $R_{ss}(j-i) := E\{s(k+j-i)\, s^H(k)\}$. •

Sketch of proof: It follows from (45) and the fact $F_0 = B_0$ that

$[\, I_N \;\; A_{11} \;\cdots\; A_{L_1 1} \,]\; \mathcal{R}_{ssL_1} = [\, B_0 B_0^H \;\; 0 \;\cdots\; 0 \,]$. (46)

Clearly rank$\{[\, I_N \;\; A_{11} \;\cdots\; A_{L_1 1} \,]\} = N$. By (H2), rank$\{B_0\} = M$ = rank$\{B_0 B_0^H\}$. The desired result then follows from (46) and Sylvester's inequality [5, p. 655]. □

In a fashion similar to $\mathcal{R}_{ssL_1}$ in Lemma 1, let $\mathcal{R}_{yyL_1}$ denote an $[N(L_1+1)] \times [N(L_1+1)]$ matrix with its $ij$-th block element as $R_{yy}(j-i) = E\{y(k+j-i)\, y^H(k)\}$; define